This report presents the results of a study to define an Adaptive
Programmable Signal Processor (APSP) suitable for on-board satellite
processing of data generated by spaceborne electro-optical surveillance
sensors and dual mode radars. The tasks performed included: 1) definition of
system requirements based upon mission requirements supplied by SAMSO,
2) definition of processor performance requirements, 3) configuration of a
processor architecture to meet those requirements, and 4) evaluation of
present and projected semiconductor device technology applicable to APSP
development.

1.0 INTRODUCTION AND SUMMARY


This report presents the results of the Electro-Optical Processor
Definition Task, statement of work item 3.3, of the Adaptive Programmable
Signal Processor (APSP) program.

The work described is based upon information and requirements contained
in the program's Mission Requirements, Systems Requirements, and
Processor Requirements documents, prepared earlier in the program.

Section 2.0 of this report summarizes the requirements placed upon,
and the functions to be performed by, the APSP. This section also discusses
the issue of design commonality for electro-optical and radar processors.
The conclusion is that there exists considerable device commonality, e.g.,
both types of processors require high speed multipliers; moderate functional
commonality, e.g., high capacity memories are used by both processors;
but little architectural commonality, i.e., those common devices are
interconnected in completely different ways.

Section 3.0 details the two independent processor architectures which
were developed, then merged to obtain the best features of each. The
section begins by describing both of the approaches, and concludes with a
description of the consolidated architecture.

Section 4.0 contains descriptions of each of the functional modules
for the consolidated processor. Included are register level diagrams and
signal flow charts. Partitioning of functions between hardware and software
is also treated in this section.

Section 5.0 discusses software for the APSP application, particularly
development of algorithms for the multi-target track problem. Track
initiation, maintenance and termination criteria are treated, along with
inter-pixel boundary problems. Estimates of instruction counts, storage
requirements and execution times are included.

2.0 REQUIREMENTS AND FUNCTIONS


2.1 REQUIREMENTS

The basic functional and performance requirements were delineated
in the Performance Requirements Report, dated October 1975, and are
summarized in Table 2-1. The APSP accepts data from the focal plane chip
at 164K samples/second (1.64 x 10^4 detectors sampled at a 10 Hz readout
rate) and, after performing various filter functions, both temporal
and spatial, tracks potential targets and outputs state vectors at the rate
of one per second per track.

2.2 COMMONALITY BETWEEN THE RADAR SIGNAL PROCESSOR AND
THE ELECTRO-OPTICAL PROCESSOR

The primary technical factors which differentiate the radar signal
processor task from that of the electro-optical sensors result from the
ranging capability of the radar, which is not available in the passive E-O
system, and the relative rates of sampling which are required. (If an E-O
system were to incorporate active laser ranging, significantly greater
commonality would exist in the signal processor.) Radar data must be
gathered at intervals determined by the transmitted pulse rate and at time
increments compatible with the desired range resolution. In comparison, the
MFPA sensors receive data continuously and their output is sampled at rates
determined by target and background variations in time.

The radar processor derives considerable capability from the
coherent nature of its sensor and receiver and from the ability to resolve
almost microscopic variations in the doppler shifts of the target, whereas
the optical data processor performs temporal and spatial filtering to reduce
the dynamic range caused by noise or clutter background. The adaptive
features of the electro-optical processor appear strongly in the front end, or
at the sensor array itself, whereas the only comparable features in the radar,
the AGC and adaptive thresholds, are mechanized further along in the signal
processing chain.

TABLE 2-1. PERFORMANCE REQUIREMENTS SUMMARY

Detector channels                4.2 x 10^6
Number of simultaneous tracks    5000
Transient dynamic range          10^3
System dynamic range             10^7
Maximum input data rate          1.6 MHz
Clutter rejection                26 dB at VT/VC = 5
Velocity discrimination          0.3 pixels/sec ΔV at 3 pixels/sec
Tracking accuracy (1σ)           0.25 pixels
Star rejection                   100 percent
Output track parameters          X, Y, Vx, Vy, ID
Nominal state vector update      once/sec
Target velocity                  0-8 pixels/sec
10 year radiation dose           10^4 rad(Si)
Power dissipation                128 watts

Angle tracking circuits in the two processors could utilize similar
track files in the main processor memory and similar algorithms in closed
tracking loops. The basic angle sensing information in the electro-optical
sensor originates in the spatial filtering functions, while the radar, having
only a single sensor pointing direction, obtains its basic directional data
without any significant tracking filter process. There is no range tracking
discriminant function in the electro-optical processor which is comparable to
that of the radar.

Doppler filtering is performed in the radar system as a means of
excluding broadband noise and clutter and obtaining signal-to-noise levels
sufficient for purposes of detection. This coherent integration utilizes
narrowband filter characteristics with inherently low sidelobe response in
order to avoid velocity ambiguity and to obtain unequivocal velocity
resolution. The Fourier transform filter is a natural choice for this
function. In the electro-optical processor, however, other filter transforms
may be usable and relatively advantageous since the transform may be used to
narrow the bandwidth of the data processor for a given degree of target
detectability. Some forms of the Walsh-Hadamard transform (which are
unsuitable for radar usage because of undesirable sidelobe characteristics)
appear to be advantageous for E-O data processing and target detection.

These transforms may be effective for the MFPA because they are
well adapted to the unique character of its data, including broad
near-uniform areas of clutter, spatial edges and large dynamic range signals.
Target radial velocity has only a secondary influence on the E-O tracker
through its effect on the observed brightness, while in the radar system the
target velocity doppler effect is the primary factor which permits detection
in clutter.
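
The remark that Walsh-Hadamard transforms suit broad near-uniform areas and
spatial edges can be illustrated with a short sketch. The function name
`fwht`, the unnormalized natural (Hadamard) ordering, and the sample data are
assumptions made for illustration only; the report does not specify which
form of the transform was considered.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural ordering).

    Illustrative only: uses additions and subtractions alone, which is
    one reason such transforms are attractive for low power hardware.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b
        half *= 2
    return data


# A uniform clutter patch collapses into a single coefficient.
uniform_patch = fwht([5] * 8)
# A sharp spatial edge needs only two coefficients.
edge_patch = fwht([0, 0, 0, 0, 8, 8, 8, 8])
```

In both cases most coefficients are zero, so fewer values need to be carried
forward for a given degree of target detectability.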

These considerations are important in assessing the degree of
commonality between electro-optical and radar processing equipment.

The AVE (adaptive video encoder) element of the electro-optical
processor is functionally integrated and contains on the sensor chip some of
the basic temporal filter functions. Thus, its temporal filter function would
be neither available to the radar nor usable by it.

The  output  of  the  Adaptive  Video  Encoder  provides  the  input  to  the 
layered  array  (LAP)  signal  processor.  The  basic  functional  blocks  of  the 
LAP  are  shown  in  Figure  2-1.  The  point  target  processor  performs  video 
integration,  compensation  for  sensor  sensitivity  variations  for  each  pixel 
element,  and  area  correlation  of  data  from  neighboring  pixels. 

The tracking processors execute target tracking algorithms under
the general control of the APSP. The computations include a determination
of the next probable position for tracks as well as algorithms compensating
for the apparent cessation of target motion caused by gaps in the
MFPA chip array. Current and past track histories are also maintained.

The APSP controller exercises supervisory control over the LAP
units. It determines the LAP modes, issues commands to the appropriate
units, and assigns targets to specific tracking processors. Track and target
data for transmission to the earth is selected by the data communication
interface.


Figure 2-1. Functional diagram of LAP.

Certain portions of the LAP appear to be useful for radar signal
processing, with minor changes or additions. Other portions provide
functions which benefit only the electro-optical mission, particularly those
elements which have specialized functions unique to the MFPA sensor. The
point target processing element, for example, is specialized and does not
perform a process useful to the radar. Data reduction prior to transmission
is not required for the radar sensor; however, the angle track processing
and data interface might have significant utility for radar signal processing.

Conclusion

The functional correspondence between the electro-optical processor
and that of the radar is rather limited. The functions of the AVE and the
point target processing elements do not resemble the needs of the radar
processor. The angle track element may be useful if there is a requirement
for radar track files, but at this time, there is no such requirement. The
data communication interface and processor control would probably be
serviceable for radar functions.

In view of these considerations, the functional correspondence
between the requirements (and therefore the architecture) of the two types of
signal processing is not great enough to justify a unified design. This
conclusion does not apply to the device development aspects of the program
since the great preponderance of CCD devices proposed for future design can
be utilized in either system.

Excluding  the  MFPA  and  the  CCD  A/D  converter,  all  of  the  remaining 
devices  for  which  conceptual  designs  have  been  originated  appear  to  be 
mutually  useful.  This  includes  items  such  as  the  CCD  full  adder,  the  CCD 
D/A,  the  on-chip  clock  driver  and  input-output  semiconductor  design  efforts. 
In  addition,  both  processors  will  make  extensive  use  of  low  power,  high 
capacity,  digital  memory  devices.  Any  developments  which  can  meet  the 
specialized  requirements  of  very  low  power  and  long  life  for  the  optical 
processor  will  likely  be  of  substantial  benefit  to  the  radar. 


Thus, the commonality between the optical and radar processors is limited
in terms of architecture and function, but there appears to be a significant
degree of potential joint usage of specialized, custom designs of CCD or
other low power devices.


3.0  APSP  ARCHITECTURE 


3.1 INTRODUCTION AND BACKGROUND

As the task of developing an architecture for the APSP progressed,
it became clear that substantial divergence of opinion existed among
competent technical personnel as to what constituted an optimum design. Upon
examination, it was apparent that the diversity of concepts really
represented variations on only two fundamental approaches. The first approach
was to have the track processor request the digitized data for specific
detector elements within the projected tracking gate from the point target
processor. The second approach was a passive signal processor which performed
temporal and spatial filtering on the digitized data from all detector
elements, and passed only information which exceeded a threshold to the track
processor.

At this point, two independent technical teams were formed, each
charged with developing the "optimum" Adaptive Programmable Signal
Processor based upon the requirements and information contained in two
program documents prepared earlier:

1. The Systems Requirements Report, CDRL A003

2. The Processor Requirements Report, CDRL A004

In mid-November a series of meetings was held with each team
presenting, describing, and to some extent, defending its approach. These
meetings resulted in a detailed examination and comparison of the two
approaches. It became apparent that the first approach required a very
complicated switching network and high data rates in the tracker
communication network. However, several of the novel concepts from that
approach, such as adaptive velocity filters and tracking processor direction
of the saturation control logic, have been maintained and incorporated. Both
approaches utilized the same Adaptive Video Encoder (AVE) discussed in
Section 4.1.

The Adaptive Video Encoder and both of the initial concepts are
described, at the system level, in this section of this report in compliance
with the contractual requirement that program reports describe "all work
performed, knowledge gained, and results achieved". However, only the
amalgamated design was further refined to the register level. Section 3.5
of this report discusses that design.

3.2 THE LAYERED ARRAY PROCESSOR (A)

This  section  describes  the  configuration  proposed  by  the  first  of  the 
two  independent  teams. 

Basic Processor Configuration

The basic configuration is shown in Figure 3.2-1. The point target
processing function includes uncorrelated pixel processing (1st layer) and
correlated pixel processing (2nd layer) for improved clutter rejection.
The track processing function implements tracking algorithms. It includes
the data communication interface (3rd layer) and the array of computing
elements (4th layer) which execute the individual tracks. The entire layered
array processor is under executive control of the computers in the control
section, which implement mode changes and coordinate the actions of
individual trackers. Reports of current tracks are relayed to the ground
through the data link. Primary processor commands originate in the
spacecraft control section, and alarms and reports on the condition of the
processor are reported to spacecraft control.

Figure 3.2-1. Processor A functional block diagram.

The designs must be expandable to near-term applications of 4 x 10^6
pixels, with eventual applications of 10^8 picture elements (pixels). Each
pixel has a position (i, j) and a magnitude (q) associated with it. The size
of a pixel equals the detector instantaneous field of view. Each detector chip
is assumed to contain an array of 128 x 128 pixels. A small amount of
insensitive space (2 to 5 pixels width) is assumed between detector chips.
A 16 x 16 chip sensor array (4 x 10^6 pixels) can be mounted on a single
8" x 8" substrate. Applications with more pixels will require multiple
sensor substrates, and larger spaces (10 to 30 pixels width) will be assumed
between substrates.
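
The pixel counts quoted above follow directly from the chip and substrate
dimensions. The short sketch below reproduces that arithmetic; the function
names are hypothetical, and the substrate estimate ignores the insensitive
gaps between chips and substrates.

```python
def pixels_per_chip(chip_side=128):
    """Pixels on one square detector chip."""
    return chip_side * chip_side


def pixels_per_substrate(chips_per_side=16, chip_side=128):
    """Pixels on one substrate carrying a square array of chips."""
    return chips_per_side * chips_per_side * pixels_per_chip(chip_side)


def substrates_needed(total_pixels, chips_per_side=16, chip_side=128):
    """Smallest number of substrates covering total_pixels (gaps ignored)."""
    per_substrate = pixels_per_substrate(chips_per_side, chip_side)
    return -(-total_pixels // per_substrate)  # ceiling division


near_term = pixels_per_substrate()      # about 4.2 x 10^6
eventual = substrates_needed(10 ** 8)   # substrates for 10^8 pixels
```

Under these assumptions one substrate carries 4,194,304 pixels, and an
application of 10^8 pixels would need on the order of two dozen substrates.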

The AVE provides amplitude reconstructed data (10 bit) which has
[...] detectors. An algorithm to reduce sensor impulse noise is also in the
AVE. A constant transfer rate of 100 frames per second is assumed into the
point target processor. Lower effective frame rates are obtained in the LAP
by digital time integration, as appropriate for detection over specific
target velocity ranges. The number of hardware data channels used will be
selected to be compatible with the degrees of parallelism in the AVE and
target processor. The interfaces with spacecraft control and the data link
are general purpose computer-type block transfer ports.

Figure 3.2-2 illustrates the LAP from a different viewpoint.
Processing for one detector chip (128 x 128 pixels) is emphasized. With the
capability for a high frame rate (100 frames/sec) and the necessity for
several frames of digital storage, several first and second layer processing
chips are needed for each detector chip. Sixteen first layer and sixteen
second layer chips are shown. One or more extra first layer chips will be
provided for redundancy to improve fault tolerance, while careful layout of
second layer chips should reduce the requirement to eight per detector chip.
The dashed lines show one way that the data can be divided for second layer
processing (32 x 32 pixel squares). However, 8 x 128 pixel rectangles
should prove more efficient.

Figure 3.2-2. Processing hardware for a 128 x 128 pixel detector array.
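
The two ways of dividing a detector chip mentioned above can be compared with
a small sketch. The helper name `partition` is hypothetical; the sketch only
confirms that both layouts give sixteen blocks of 1024 pixels, and it does
not model which layout is more efficient in hardware.

```python
def partition(chip_side, block_rows, block_cols):
    """Return the (row, col) origin of every block tiling a square chip."""
    if chip_side % block_rows or chip_side % block_cols:
        raise ValueError("block dimensions must divide the chip side")
    return [
        (row, col)
        for row in range(0, chip_side, block_rows)
        for col in range(0, chip_side, block_cols)
    ]


squares = partition(128, 32, 32)      # 32 x 32 pixel squares
rectangles = partition(128, 8, 128)   # 8 x 128 pixel rectangles
pixels_per_square = 32 * 32
pixels_per_rectangle = 8 * 128
```

Either layout assigns 1024 pixels to each of sixteen processing chips, which
matches the sixteen first layer and sixteen second layer chips shown for one
detector chip.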

Target Processing

Target processing is shown functionally in Figure 3.2-3. Amplitude
data is input from the AVE. Globally selectable digital time integration
allows the frame integration time to be set to optimize the detection of
expected rapidly changing targets. Data can then be passed through area
correlation (spatial filtering). Area correlation is helpful in distinguishing
point targets from distributed clutter such as clouds or some types of sun
glint. However, it will not help when tracking targets against a star
background. When a specific element of the second layer array fails,
operating the corresponding first layer chip without area correlation is an
appropriate degraded mode of operation.

Figure 3.2-3. Point target processing.


The signal path then divides. One branch goes through change
measurement (time filtering) for fast targets. This is optimized (via wider
bandwidth) to provide maximum sensitivity for fast targets. The other
branch simultaneously provides additional digital time integration to supply
data for change measurement for slow targets. Separate adaptive thresholds
are maintained for both fast and slow target detection. These are
determined independently and dynamically for each pixel. Thresholds are
determined from the apparent noise level at the target detector. The adaptive
threshold feature can adjust to changes in target detectability due to
different clutter conditions in different parts of the image and due to
different hardware conditions such as noisy sensor cells or the absence of
sensors on lines between sensor chips. The diagram also shows some of the
special logic necessary for efficient self test.

Time Integration


Time integration of incoming data is performed to increase the signal
to noise ratio and reduce the effect of transients. The integration is
performed in two stages to allow detection of both fast and slow targets. To
integrate the data for fast target detection, 1 to 16 samples of data for each
pixel are summed. The second integration then adds from 2 to 8 of these
previous integration summations to allow detection of slowly changing targets.
Thus, for slow target integration, summations of up to 128 samples of
incoming data are possible. The number of samples summed by each integration
is independently selectable under global control. Four guard bits are
provided for the first integration and three are supplied for the second to
prevent overflow. The integrated sums are rounded off to 16 bits and scaled to
assure that the most significant bits are transferred regardless of the number
of samples summed. The results of the first integration are used to generate
the area correlation data which is used to detect both fast and slow targets.
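
The two-stage integration described above can be sketched in a few lines.
The function names and the treatment of an incomplete trailing group (it is
simply dropped) are assumptions for illustration; scaling and rounding are
omitted here.

```python
def integrate(samples, count):
    """Sum consecutive groups of `count` samples for one pixel."""
    usable = len(samples) - len(samples) % count
    return [sum(samples[i:i + count]) for i in range(0, usable, count)]


def guard_bits(count):
    """Extra bits needed so a sum of `count` full-scale samples fits."""
    return (count - 1).bit_length()


def two_stage(samples, n_fast, n_slow):
    """First integration (fast targets) feeding the second (slow targets)."""
    if not 1 <= n_fast <= 16:
        raise ValueError("first integration sums 1 to 16 samples")
    if not 2 <= n_slow <= 8:
        raise ValueError("second integration sums 2 to 8 first-stage sums")
    fast = integrate(samples, n_fast)
    slow = integrate(fast, n_slow)
    return fast, slow


fast_sums, slow_sums = two_stage([1] * 128, n_fast=16, n_slow=8)
```

With the maximum settings each slow-target value covers 16 x 8 = 128 input
samples, and the guard bit counts of four and three correspond to sums of up
to 16 and 8 terms respectively.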

A functional implementation of the first integration is shown in
Figure 3.2-4. Data entering from the input buffer is added to the temporary
sum that is kept in the memory for each pixel. The adder is bit serial and
represents a minimal amount of extra area for the chip. The select unit is
capable of selecting the temporary sum for normal integration, zero for
initiating a new sum, or a shifted version of the sum at the completion of an
integration to accomplish scaling. The selection is performed by global
control. The memory contains 1024 words, each of which has 4 guard bits
to prevent overflow. At the completion of a summation sequence, round-off
is accomplished by adding a roundoff bit to the least significant bit position
of the 16 most significant bits produced by scaling. The final summations for
each pixel are transferred to the area correlation and change measurement
filters for further processing.

Figure 3.2-4. Time integration I.
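
The select, accumulate, scale and round-off sequence described above might be
modeled as follows. This is a sketch under stated assumptions: the shift
amount is taken to equal the number of guard bits in use, rounding is modeled
as adding half of the discarded weight before shifting, and the names
`round_shift` and `integrate_pixel` are hypothetical.

```python
def round_shift(total, shift):
    """Scale by a right shift, adding a roundoff bit before truncation."""
    if shift < 0:
        raise ValueError("shift must be non-negative")
    if shift == 0:
        return total
    return (total + (1 << (shift - 1))) >> shift


def integrate_pixel(samples, shift):
    """Accumulate samples for one pixel, then scale and round the sum."""
    temporary_sum = 0                 # select unit chooses zero: new sum
    for sample in samples:
        temporary_sum += sample       # select unit chooses the running sum
    return round_shift(temporary_sum, shift)  # shifted sum at completion


scaled = integrate_pixel([100, 101, 102, 103], shift=2)
```

Adding the roundoff bit before the shift rounds to the nearest retained
value rather than always truncating downward.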

Spatial  Filtering 

Spatial  filtering  is  required  to  help  distinguish  moving  point  targets 
from  changing  backgrounds  such  as  those  from  moving  clouds  or  changing  sun 
glint  patterns.  The  key  feature  which  allows  discrimination  is  that  the 
changing  background  patterns  are  correlated  over  a number  of  adjacent  pixels. 
This  is  not  the  case  for  point  targets. 

The processing for spatial filtering considers each pixel along with
its eight neighbors as shown in Figure 3.2-5. The neighbors are the 4 pixels
on the edges (the E's) and the 4 pixels on the diagonals (the D's).
Symmetrical filters are assumed. The relation is appropriate for computing
background references to be used for time change measurement. The P
relation evaluates a property which could be called peakedness at the central
point. This is the value of the central point less an average predicted for
that central point based on the eight neighbors. The W_C, W_E, and W_D are
weighting constants which will be selected to provide good filter responses.
The calculations are repeated for all pixels as central points. For pixels on
the edges of sensor chips, the available neighbors are used with different
W's.

Figure 3.2-5. Spatial filtering concept.
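
The peakedness idea can be illustrated with a small sketch. The exact
relation appears only in the original figure, so the form used here (weighted
central value minus weighted sums of the edge and diagonal neighbors), the
default weights, and the function name `peakedness` are all assumptions.
Only interior pixels are handled; the edge-of-chip case with different W's
is not modeled.

```python
EDGE_OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
DIAGONAL_OFFSETS = [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def peakedness(image, row, col, w_c=1.0, w_e=0.125, w_d=0.125):
    """Central value less a neighbor-based prediction (assumed form)."""
    if not (0 < row < len(image) - 1 and 0 < col < len(image[0]) - 1):
        raise ValueError("only interior pixels are handled in this sketch")
    edge_sum = sum(image[row + dr][col + dc] for dr, dc in EDGE_OFFSETS)
    diagonal_sum = sum(image[row + dr][col + dc] for dr, dc in DIAGONAL_OFFSETS)
    predicted = w_e * edge_sum + w_d * diagonal_sum
    return w_c * image[row][col] - predicted


clutter = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
target = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
flat_response = peakedness(clutter, 1, 1)
point_response = peakedness(target, 1, 1)
```

With equal weights of 1/8 the prediction is the plain neighbor average, so a
spatially correlated background gives no response while a point target
stands out.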

Hardware for area correlation, provided as a second layer, is
illustrated in Figure 3.2-6. Data for area correlation is stored in up to
four 16 bit words for each of 1024 pixels. Inputs to this shift register
memory are selected from either of two first layer chips (for fault
tolerance) or recirculated for further correlation calculations. Correlation
arithmetic is performed in pipelined parallel fashion on the central pixel
being processed and eight neighbors. Correlation or averaging of more distant
pixels can be accomplished in multiple passes. Each neighbor is automatically
selected by connection of shift register taps whose location corresponds to
neighbors' array positions. The taps for the 4 data words of each neighbor
pixel enter an adder input select multiplexer. This allows any neighbor pixel
word (field) to be operated on, selectable by global control. Appropriate
input lines from adjacent chips may also be selected where neighbor pixels
cross chip boundaries. The corresponding chip output words are also selected
for use by neighbor chips as such input data. Compensation for sensor gaps
may be accomplished by substituting central pixel or fixed values on the
neighbor chip input lines.

Figure 3.2-6. Second layer block diagram.

Second layer operational capabilities are indicated in Figure 3.2-7.
An operation is performed on all pixels in a single pass, with correlation to
all 8 neighbors. Use of 4 words per pixel allows prior and newly updated
pixel values to co-exist in memory. The first pixel to be processed in an
array will require the most recently updated neighbor data from the previous
frame. The last pixel to be processed in the frame will still require
neighboring data from the previous frame even though updated data is now
also available.

• PERFORMS ADD, SUBTRACT, MULTIPLY, ARITH RIGHT SHIFT, SET FLAG ON COND

• OPERATES ON (8) NEIGHBORS AND CENTRAL PIXEL IN PARALLEL; NEIGHBORS CAN
BE ON ADJACENT CHIPS; 2ND LAYER COMPENSATES FOR SENSOR GAPS

• EACH OPERAND MAY BE ANY OF (4) PIXEL DATA FIELDS

• CONSTANT MAY ALSO BE USED FOR OPERAND

• SELECTS OPERANDS FROM COMMON PROCESSING FRAME TIME

Figure 3.2-7. Second layer arithmetic unit.

First layer data will be input continuously as processing is being
performed on data from the previous frame. Thus, when a pixel enters the
arithmetic unit (one frame after being input from first layer) all neighbor
pixels have also been stored in second layer memory. Data is processed to
16 bit accuracy with rounding and scaling.
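
The need for prior and newly updated pixel values to co-exist can be shown
with a one-dimensional sketch. The neighbor-sum operation and the function
names are hypothetical stand-ins for a second layer pass; the point is only
that every pixel must read its neighbors from the same (previous) frame.

```python
def update_double_buffered(prior):
    """Each pixel reads neighbors from the prior frame only."""
    updated = list(prior)
    for i in range(1, len(prior) - 1):
        updated[i] = prior[i - 1] + prior[i] + prior[i + 1]
    return updated


def update_in_place(values):
    """Single buffer: later pixels see already-updated neighbors."""
    data = list(values)
    for i in range(1, len(data) - 1):
        data[i] = data[i - 1] + data[i] + data[i + 1]
    return data


frame = [1, 2, 3, 4, 5]
consistent = update_double_buffered(frame)
contaminated = update_in_place(frame)
```

The single-buffer version mixes values from two different frame times, which
is the situation the separate data fields per pixel avoid.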

Change  Measurement 

Change measurement determines the differences between time
integrated data and predicted values based on time integrated area correlated
data. Change measurement is accomplished for each individual pixel.
Programming by global control allows changes to be computed for pixel data
separated in time by any number of time integration frames. As indicated
on the hardware block diagram (Figure 3.2-8), data may be recirculated until
the desired time difference for change computation occurs.

The  hardware  sums  past  and  future  area  correlated  data  to  provide 
an  estimate  of  the  current  background  value.  Future  data  is  taken  from  the 
second  layer  as  needed  with  no  time  delay.  Past  data  is  delayed  for  2 time 
difference  periods  by  the  2 memories  shown.  The  summed  area  correlated 
past  and  future  data  (16  most  significant  bits)  are  then  subtracted  from  the 
present  value  which  has  been  delayed  one  time  difference  period.  (The 
present  value  may  be  delayed  one  additional  frame  to  compensate  for  area 
correlation  delay). 
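
The past/future background estimate described above can be sketched as
follows. Halving the summed past and future values (modeled as a one-bit
right shift) is an assumption about how the most significant bits of the sum
are taken; the function name and data are illustrative only.

```python
def change_measurement(pixel, background, delay):
    """Present pixel value minus the mean of past and future background.

    pixel      -- time integrated values for one pixel, one per frame
    background -- area correlated values for the same pixel and frames
    delay      -- time difference period, in frames
    """
    changes = []
    for t in range(delay, len(pixel) - delay):
        estimate = (background[t - delay] + background[t + delay]) >> 1
        changes.append(pixel[t] - estimate)
    return changes


# A steadily brightening background produces no change output.
drift = change_measurement([10, 20, 30, 40, 50], [10, 20, 30, 40, 50], delay=1)
# A transient in one frame stands out against a constant background.
transient = change_measurement([100, 100, 150, 100, 100], [100] * 5, delay=1)
```

Because the estimate straddles the present frame, a linear drift in
background level cancels, whereas a short-lived change does not.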

If failures in the second layer cannot be compensated by redundant
chips, time integrated data without area correlation may be selected to
provide an estimate (without use of second layer data) at somewhat degraded
performance levels. This configuration is also useful for deep space tracking
where area correlation is unnecessary.

Two change measurement units are provided to allow independent
change measurement for fast and slow targets. The number of time
integration frames between change values may be independently varied for slow
and fast targets. One change detection unit allows additional integration of
both (previously integrated) pixel and area correlated data, providing
simultaneous separate globally controlled integration times for fast and slow
targets. Integration of up to 128 samples can thus be accomplished.

Figure 3.2-8. Change measurement. (Additional paths for Time Integration II
are added to one of the two change measurement units.)

Adaptive  Thresholding 

Adaptive thresholding provides a variable threshold for each pixel for
target detection. The threshold is based on a scaled average of magnitudes
of several previous differences between estimated and measured values.
The threshold may be a summation of 1 to 32 previous difference magnitudes
which are scaled (by a factor of 1, 1/2, or 1/4) by shifting. The number
of differences summed and the scaling factor are selectable under global
control. The threshold may be set in increments of 1/4 difference to any
number of differences up to 8. Additionally, 9 to 32 differences may be used
as a threshold with a coarser increment of threshold selection. Scaling by
1/8 may be added to provide finer threshold selection of up to 16 differences.

A constant may be selected under global control instead of the previous
difference sum, or as a minimum level to be used with a previous difference
sum.
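
A threshold built from summed difference magnitudes and a shift-based scale
factor might look like the sketch below. The function name, the argument
layout, and the use of the most recent `count` differences are assumptions;
exclusion of target hits from the sum is not modeled.

```python
def adaptive_threshold(differences, count, shift, minimum=0):
    """Sum of recent difference magnitudes, scaled by a right shift.

    count   -- number of previous differences summed (1 to 32)
    shift   -- 0, 1, 2 or 3 for scale factors 1, 1/2, 1/4 or 1/8
    minimum -- constant floor applied to the computed threshold
    """
    if not 1 <= count <= 32:
        raise ValueError("count must be between 1 and 32")
    if shift not in (0, 1, 2, 3):
        raise ValueError("shift must be 0, 1, 2 or 3")
    recent = differences[-count:]
    total = sum(abs(value) for value in recent)
    return max(total >> shift, minimum)


history = [4, -4, 4, -4]
quarter_step = adaptive_threshold(history, count=1, shift=2)
one_difference = adaptive_threshold(history, count=4, shift=2)
floored = adaptive_threshold(history, count=4, shift=2, minimum=10)
```

With a scale factor of 1/4, summing n differences gives a threshold of n/4
times the typical difference magnitude, which is how 1 to 32 terms span up
to 8 differences in increments of 1/4.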

The adaptive threshold hardware (Figure 3.2-9) stores the previously
computed threshold value for use while differences are being summed to
compute the new threshold. A target hit (threshold exceeded) will cause
threshold summation of the pixel containing the hit to be disregarded. A
1K x 1 bit memory (not shown) stores the hit data. Absolute value is obtained
by selecting true or one's complement change detection data and providing a
carry in, if necessary, to the difference summation adder.
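
The absolute value technique mentioned above (true or one's complement data
plus a carry in) can be sketched for a 16 bit two's complement word. The
word width, the names `to_word` and `add_magnitude`, and the unbounded
accumulator are assumptions for illustration.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1
SIGN_BIT = 1 << (WIDTH - 1)


def to_word(value):
    """Encode a signed integer as a 16 bit two's complement word."""
    return value & MASK


def add_magnitude(accumulator, word):
    """Add |word| to the accumulator without a separate negation step."""
    if word & SIGN_BIT:
        operand = ~word & MASK   # select the one's complement data
        carry_in = 1             # carry in completes the negation
    else:
        operand = word           # select the true data
        carry_in = 0
    return accumulator + operand + carry_in


total = 0
for difference in (-5, 7, -3):
    total = add_magnitude(total, to_word(difference))
```

Folding the "+1" of two's complement negation into the adder's carry input
avoids a second addition for negative differences.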

Independent threshold computations are performed for fast and slow
targets. Threshold levels (number of differences and scaling) may be set
independently for fast and slow targets.

Figure 3.2-9. Adaptive thresholding functional diagram.

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Burst  Location 

Burst centroid location can be accomplished in the sequence shown
below. In a burst location mode, one or multiple burst centroid locations are
determined by parallel second layer processing.

• Burst  location  mode  triggered  by  burst  detector 

• Second  layer  sets  flag  for  each  saturated  pixel 

• Number  of  saturated  pixels  in  each  row,  column  summed  by 
second  layer 

• Centroid location(s) at intersection of row and column with
greatest number of saturated pixels

• Events  in  sensor  array  gaps  may  be  located  by  effects  on  edge 
pixels 

Point Target Processor Fault Tolerance

Fault tolerance is achieved in the Point Target Processor by periodic
tests under global control which isolate any faulty chips. Operational
redundant chips are switched by global control to replace those chips found
to be faulty. Fault tolerant features for the point target processor are
summarized below.

Self test of first and second layer chips is performed using techniques
of the Advanced Avionics Fault Isolation System (AAFIS), developed under
government contract. AAFIS utilizes test pattern generators contained
within units under test to provide self test. Test responses (chip outputs)
for the entire test sequence are reduced to a single code word which may be
compared to the correct coded test response. A test response code checker
is provided on each unit (chip) to be isolated. Pseudorandom test pattern
generation and response coding are very economically implemented with CCD
shift registers, and will constitute less than 0.2 percent of first and second
layer chip logic.

Pseudorandom test patterns are generated on each first layer chip
by feedback shift register hardware as shown in Figure 3.2-10. A shift
register of N bits can generate up to 2^N - 1 pseudorandom patterns of fixed
sequence. Pseudorandom patterns will efficiently and thoroughly test the
arithmetic, shift register memory, and select logic implemented in first
and second layer chips. Remaining test inputs will be provided by global
control. Global control inputs to the chips will be varied during the test
to verify operation of all globally controlled chip functions.

N. Benowitz, D. F. Calhoun, G. E. Alderson, J. Bauer, C. T. Joeckel,
"An Advanced Fault Isolation System for Digital Logic," IEEE Trans.
Computers, Vol. C-24, No. 5, May 1975, pp. 489-497.

16 BIT PSEUDORANDOM PATTERNS

GENERATOR POLYNOMIAL 1 + X + X^3 + X^12 + X^16 (16 BITS)

Figure 3.2-10. Pseudo-random pattern generator.
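A feedback shift register of this kind can be modeled in a few lines. The sketch below assumes a Galois (internal XOR) arrangement and a seed of 1; the report does not state which register arrangement is used, so only the polynomial exponents (16, 12, 3, 1) are taken from the figure.

```python
# Exponents 16, 12, 3, 1 of the generator polynomial map to bits 15, 11, 2, 0.
TAP_MASK = 0x8805


def lfsr_patterns(seed=1, mask=TAP_MASK):
    """Yield successive 16-bit patterns of a Galois feedback shift register."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("the all-zero state never changes")
    while True:
        yield state
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= mask


def sequence_length(seed=1):
    """Count the patterns produced before the seed pattern recurs."""
    patterns = lfsr_patterns(seed)
    first = next(patterns)
    for count, state in enumerate(patterns, start=1):
        if state == first:
            return count
```

If the polynomial is primitive, as the label in Figure 3.2-11 states, `sequence_length()` should return 2^16 - 1, matching the 2^N - 1 bound given above.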

The patterns thus generated circulate through first and second layer
chips. For isolation purposes, feedback from second to first layer is
disabled.

Each first and second layer chip contains a response code generator.
The cyclic code generator shown in Figure 3.2-11 is ideally suited to CCD
shift register implementation. All chip outputs are serially entered into
the code generator for each test pattern. The cyclic code checker codes its
input data stream by considering this binary data to be a polynomial and
dividing it by a polynomial implemented in code checker hardware. The final
code word is the remainder of the division. Any errors in the data checked,
including multiple bit errors, will be detected unless the remainder of the
erroneous data stream is the same as that produced by the good data stream.
Nearly all erroneous outputs will be detected, with only (1/2)^N of errors
undetected for an N bit division polynomial. The 16 bit cyclic code checker
shown in Figure 3.2-11 will detect 99.998 percent of erroneous test
responses.

PRIMITIVE POLYNOMIAL 1 + X + X^3 + X^12 + X^16 (16 BITS)

Figure 3.2-11. 16-bit serial cyclic code pattern checker.
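The response coding can be illustrated as bitwise polynomial division over GF(2). This is a sketch under assumptions: the bit ordering (first bit entered is the highest-order coefficient) and the function name are chosen here for illustration and are not taken from the report.

```python
POLY = 0x1100B  # x^16 + x^12 + x^3 + x + 1
DEGREE = 16


def signature(bits, poly=POLY, degree=DEGREE):
    """Remainder of the serial bit stream divided by the checker polynomial."""
    rem = 0
    for bit in bits:
        rem = (rem << 1) | (bit & 1)
        if rem >> degree:  # a term of order `degree` appeared
            rem ^= poly    # subtract (XOR) the divisor
    return rem


# Coefficients of the divisor, highest order first (17 bits).
POLY_BITS = [(POLY >> i) & 1 for i in range(DEGREE, -1, -1)]
```

A single flipped bit always changes the signature. An error pattern that is itself a multiple of the divisor polynomial leaves the signature unchanged, which is the undetected case described above.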

If feedback from other chips is eliminated, examination of each chip
code response serves to isolate faults to one chip. As shown in Table 3.2-1,
a "fail" first (or second) layer chip test result directly indicates the failed
chip if the corresponding second (or first) layer chip has passed the test.

If the tests of both a first layer chip and its associated second layer chip
fail, either 1) a first layer chip failure may have propagated erroneous data
into a correctly functioning second layer chip, or 2) both first and second
layer chips may be faulty. A second test (Test 2) may then be performed,
switching the second layer chip to a known operational first layer chip. This
test will indicate whether both chips or the first layer chip only were faulty.
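The isolation logic of Table 3.2-1 can be written as a small decision function. The function name and argument names are hypothetical; the returned strings follow the "Faulty Chip" column of the table.

```python
def isolate_fault(first_pass, second_pass, test2_second_pass=None):
    """Apply Table 3.2-1 to Test 1 results and an optional Test 2 result.

    test2_second_pass is the second layer chip result when it is driven
    by a known operational first layer chip (Test 2).
    """
    if first_pass and second_pass:
        return "None"
    if first_pass:
        return "Second Layer"
    if second_pass:
        return "First Layer"
    # Both failed in Test 1: Test 2 is needed to decide.
    if test2_second_pass is None:
        return "Examine Second Test Result"
    return "First Layer" if test2_second_pass else "First and Second Layer"
```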

Upon detection of faulty chips, fault tolerance is provided by
electronically substituting redundant operational chips for those which have
failed. This is accomplished by fixed interconnection of redundant chips in
the first and second layers; e.g., one redundant chip per 4x4 array serving an
MFPA chip. As shown in the example of Figure 3.2-12, any faulty chip may be
replaced by the chip below it. Each chip below the faulty chip is switched to
handle the processing normally performed by the chip above it, with the
redundant chip handling computations normally performed by the last chip.
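The shift-down substitution can be sketched as a mapping from processing tasks to physical chips. The indexing scheme (chips numbered top to bottom, spare chip last) and the function name are assumptions for illustration.

```python
def remap_tasks(n_tasks, failed=None):
    """Map each task to a physical chip in a chain with one spare at the end.

    Physical chips are numbered 0..n_tasks; chip n_tasks is the redundant
    chip. Tasks at or below the failed position move one chip down.
    """
    if failed is None:
        return list(range(n_tasks))
    return [task if task < failed else task + 1 for task in range(n_tasks)]
```

For example, with four tasks and a failure of chip 1, tasks 1, 2, and 3 move to chips 2, 3, and 4, and the failed chip is left unused.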

An additional redundancy feature is the ability to perform degraded
accuracy computations in the event of second layer failure. Here the
feedback from the second to the first layer chip is disabled and first layer
data substituted for area correlated second layer data.
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TABLE  3.2-1.  ISOLATING  TO  A FAULTY  POINT  TARGET 

PROCESSOR CHIP

           First Layer    Second Layer
           Chip           Chip            Faulty Chip

Test 1     Pass           Pass            None
           Pass           Fail            Second Layer
           Fail           Pass            First Layer
           Fail           Fail            Examine Second Test Result

Test 2     -              Pass            First Layer
           -              Fail            First and Second Layer


Figure 3.2-12. Point target processor fault tolerance. (Test mode response
coding: 1 code word representing all chip responses to complete test
sequence.)

Hardware Estimates for First and Second Layer

Table 3.2-2 shows the hardware required in a first layer chip capable
of performing all time integration, change measurement, and adaptive
threshold functions in parallel for fast and slow targets. One chip provides
all first level processing for 1024 pixels. The total chip area requirements
are within projected capability of 250,000 shift register bits per chip. Only
7 input/output lines are needed (including fault tolerance provisions) in
addition to control line(s), power, ground, and clock lines. A 32 x 32 pixel
array offers symmetry; however, a 128 x 8 array on each chip reduces the
second layer chip I/Os.

TABLE 3.2-2. FIRST LAYER HARDWARE REQUIREMENTS
(First Layer Chip to Process 1024 Pixels)

Processing                 Shift Register   Memory            Full Adders for
Function                   Memory           Bits              Parallel Operation

Input Buffer               32 x 16          16,896            -
                           1K x 16

Time Integration I         1K x 20          20,480            20

Change Measurement         1K x 16          49,152            32
(Fast Targets)             1K x 16
                           1K x 16

Change Measurement         1K x 19          58,368            64 - 76
(Slow Targets),            1K x 19
Time Integration II        1K x 19

Adaptive Threshold,        1K x 1           27,648 - 35,840   21 - 29
each of 2 units            1K x 13-17
                           1K x 13-17

Self Test                  1 x ~12          ~28               4 - 6
                           1 x ~16

Total for Each Identical                    200,220 -         162 - 192
First Layer Chip                            216,604

Tactical (non control)                      5 + 2
I/O

This total is within currently projected CCD capability of 250,000 shift
register bits per chip.
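The totals in Table 3.2-2 can be cross-checked with a short calculation. The dictionary layout and names below are hypothetical; the numbers are those of the table, with the adaptive threshold row counted twice because there are two units per chip.

```python
K = 1024  # "1K" words in the table

# name: (bits low, bits high, adders low, adders high, units per chip)
FIRST_LAYER = {
    "input buffer": (32 * 16 + K * 16, 32 * 16 + K * 16, 0, 0, 1),
    "time integration I": (K * 20, K * 20, 20, 20, 1),
    "change measurement (fast)": (3 * K * 16, 3 * K * 16, 32, 32, 1),
    "change measurement (slow) / TI II": (3 * K * 19, 3 * K * 19, 64, 76, 1),
    "adaptive threshold": (K * (1 + 13 + 13), K * (1 + 17 + 17), 21, 29, 2),
    "self test": (12 + 16, 12 + 16, 4, 6, 1),
}


def chip_totals(rows):
    """Return (bits low, bits high, adders low, adders high) for one chip."""
    totals = [0, 0, 0, 0]
    for *values, units in rows.values():
        for i, value in enumerate(values):
            totals[i] += value * units
    return tuple(totals)


PROJECTED_BITS_PER_CHIP = 250_000
```

Under these inputs `chip_totals(FIRST_LAYER)` reproduces the tabulated range of 200,220 to 216,604 memory bits and 162 to 192 full adders, below the projected 250,000 bits per chip.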

Table 3.2-3 summarizes the hardware required for alternate second
layer chips to handle area correlation of 1024 and 2048 pixels, respectively.
The number of adders required depends upon the degree of parallelism
required. Numbers shown are for fully parallel operation. I/O requirements
for tactical signals are reduced from 22 to 9 (including fault tolerance) if
a 128 x 8 (or 64 x 16) array is used in place of a 32 x 32 pixel array.

TABLE 3.2-3. SECOND LAYER HARDWARE REQUIREMENTS

Processing                 Shift Register   Memory     Full Adders for
Function                   Memory           Bits       Parallel Operation

Second Layer Chip to Process 1024 Pixels

Area Correlation,          1K x 16,         65,536     16 x 8 = 128
Burst Location             4 words

Self Test                  1 x ~16          ~16        2 - 4

Total for each identical                    65,552     130 - 132
second layer chip

Tactical (non control) I/O                  13 + 9 or 6 + 3

Second Layer Chip to Process 2048 Pixels

Area Correlation,          1K x 16,         131,072    16 x 8 x 2 = 256
Burst Location             4 words,
                           2 memories

Self Test                  1 x ~16          ~16        2 - 4

Total for each identical                    131,088    258 - 260
second layer chip

Tactical (non control) I/O                  13 + 9 or 6 + 3

These totals are within projected CCD capability of 250,000 shift
register bits/chip.


Target  Tracking 


Detected targets are first assigned to a track file which is maintained
by a tracker processor. Periodically each track file requests data on its
assigned target to update its track parameters. It is assumed for the purpose
of analysis that each track file has access to all pixels of data in the first
layer of the point target processor. Furthermore, a maximum of 5000 targets
(including false trial tracks) must be tracked simultaneously. To enable
a preliminary design of the system it is estimated that each tracker processor
is capable of handling 10 track files, thus requiring that a total of
500 tracker processors be provided. The medium sized system (4 x 10^6 pixels)
which was used for design purposes requires that the track files obtain data
from 4000 first layer subarrays. Each subarray contains data for 1024 pixel
elements. Since 500 processors must have access to 4000 subarrays during
each frame, a communication network must be provided.
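The sizing figures in this paragraph follow from simple division, sketched below. The variable names are hypothetical; note that the exact subarray count is slightly under the rounded figure of 4000 used in the text.

```python
import math

MAX_TARGETS = 5000               # including false trial tracks
TRACK_FILES_PER_PROCESSOR = 10   # estimate stated in the text
SYSTEM_PIXELS = 4_000_000        # medium sized system
PIXELS_PER_SUBARRAY = 1024

tracker_processors = math.ceil(MAX_TARGETS / TRACK_FILES_PER_PROCESSOR)
first_layer_subarrays = math.ceil(SYSTEM_PIXELS / PIXELS_PER_SUBARRAY)
```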

The most straightforward design approach is to provide a single bus
as shown in Figure 3.2-13. The word transfer rates (1 MHz) demand that
separate parallel 16-bit buses be provided for the transfer of target
addresses to the subarrays and the return of data to the trackers. The basic
configuration of each bus network is shown in the figure and includes a data
selector array, a bus driver, a bus receiver, a fanout buffer, and a central
bus controller. The data selector determines which track file request (or
return data) is placed on the bus. The data is transmitted over a 16-bit
parallel bus via drivers and receivers. The fanout buffer distributes the data
to all subarrays (or tracker processors) so that the appropriate one can
identify and receive it. The bus controller is responsible for controlling and
sequencing all operations.

An alternate approach to the design of the tracker communication
network is a multilevel bus. An example of such a network is given in Figure
3.2-14. The network is composed of several levels of bus elements which
form a sort of matrix. The elements in each level are connected only to
those in the next level. All buses are serial to reduce the interconnections.

• THE MULTILEVEL MATRIX CONFIGURATION PERMITS A HIGH DEGREE OF FAULT
TOLERANCE AND PERMITS HANDLING NONUNIFORMLY DISTRIBUTED DATA

• NETWORK ELEMENTS MAY BE CROSSPOINT SWITCHES OR STORE AND FORWARD UNITS


Figure 3.2-14. The multilevel bus approach.


Redundant connections are easily provided to increase the fault tolerance
and provide multiple access paths to all subarrays. The latter feature prevents
overloading of a subset of buses when target clustering occurs in a small
number of subarrays. It also reduces the data rate through any one bus so
that serial buses can be used. The bus elements may be either crosspoint
switches or store and forward units. A separate multilevel bus network is
needed for both the address and return data buses. The number of bus
elements required for this type of network is dependent upon the following
factors: number of tracker processors, number of first level subarrays, worst
case target clustering, maximum number of I/Os permitted per chip, and the
amount of redundancy desired.

Table 3.2-4 compares the characteristics of the conventional parallel
(CP) and multilevel serial (MS) bus approaches for the design of the tracker
communication network. A CP bus design that contains no redundancy requires
greater than 5000 chips for a 4 x 10^6 pixel system. A MS bus design that is
fully redundant requires approximately 3500 chips. A less conservative
design requires only half that amount (1750 chips). Two criteria were used
in obtaining the chip estimates for both design approaches. The maximum
chip complexity was limited to 1000-1500 gate equivalents, and the maximum
number of chip I/Os was restricted to 20. (If 2200 maximum targets are
assumed for the medium sized system instead of 5000 the following chip
estimates for a MS bus design are obtained: 3000-3500 chips for a
conservative design and 1500-1750 chips for a less conservative approach.)
The CP bus approach that contains no redundancy requires approximately
200,000 interconnections to connect all the chips, compared to 25,000
interconnections for a fully redundant MS bus design. The chips for the CP
design are expected to be of low to medium complexity while those of the MS
bus approach are of medium to high complexity. The CP bus approach utilizes
conventional design techniques, but in order to optimize the MS bus design
a simulation should be used. The simulation program would allow an
efficient means of trading off different designs and evaluating the effect of
the numerous variables involved. Clearly though, the MS bus approach
offers significant advantages over a CP bus design.

TABLE 3.2-4. COMPARISON OF CONVENTIONAL PARALLEL
AND MULTILEVEL SERIAL APPROACHES

Conventional Parallel Bus               Multilevel Serial Bus

The parallel 16 bit configuration       Due to the number of identical
is needed to satisfy the data rates     network elements serial data buses
set by the update rate and the          will satisfy the required data rates.
number of track files.

To achieve fault tolerance redundant    Fault tolerance can be readily
units and multiple parallel buses       incorporated into the basic network
must be provided. Thus a significant    configuration at the expense of
hardware increase is necessary.         nominal extra hardware and
                                        interconnections.

Input buffers on the subarrays must     The basic design of the network
be provided to handle nonuniform        takes into account nonuniform
target distribution.                    target distributions.

A non-redundant design requires         A fully redundant design requires
approximately 5000 chips for a          fewer than 5000 chips.
4 x 10^6 pixel system.

6 chip types are required.              3 chip types are required.

Approximately 200,000                   Approximately 25,000
interconnections are needed for a       interconnections are needed for a
non-redundant design.                   fully redundant design.

The chips are mostly low to medium      The chips are medium to high
complexity.                             complexity.

The functional design utilizes          In order to optimize the design
conventional techniques.                simulation should be used.

A store and forward (S&F) unit provides the same function, but also
requires buffer memory and decision logic. A comparison of these two
approaches is given in Table 3.2-5 and indicates that, for several reasons,
a S&F unit is the superior network element for the design of the multilevel
bus tracker communication network.


Global  Control 

Global control is used because of its suitability for control of large
numbers of identical processing chips operating in parallel and performing
the same computation. Global control saves repetition of control logic on
each of the processing chips. The global control also can be efficiently
implemented with I2L logic, possibly in the form of 3 or more parallel
computers resembling the tracking processors. Local control could result in a
less efficient implementation of random control logic gating or read only
memory on CCD devices not well suited for these functions.

Global control also provides an intelligent decision making capability
which may be dangerous to place on processing chips not protected from
chip failure by replicated (e.g., triplicated) logic. It provides a central
location for global decision making, system self test, fault tolerant
reconfiguration, and task (tracker) assignment.
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TABLE  3.2-5.  COMPARISON  OF  CROSSPOINT  AND 
STORE AND FORWARD APPROACHES

Crosspoint Switch                       Store & Forward

Complete path must be established       Data is sequentially transferred
before data can be sent.                from level to level, thus only one
                                        path segment must be available to
                                        transfer data to the next stage.

A controller is needed to determine     Each unit is independent and does
the best path of those available.       not require central control.
Control of every crosspoint switch
is necessary.

This approach bottlenecks faster        This approach tolerates greater
with clustering of target data          clustering of target data. Fewer
because complete dedicated paths        units per level are thus required.
are necessary for data transfer.
Thus more units per level and
more connections are required to
maintain same data rates.

The basic unit is a hardware switch     The basic unit requires more
that must be controlled by a central    circuitry (however fewer are needed).
control unit.                           More flexibility and control are
                                        provided.

Nonuniform location of targets will     Overloading of time slots is
cause some time slots to be             prevented by buffering and a time
overloaded. Thus a timing controller    associative queue in each unit.
must be provided.

Input buffers for the subarrays are     Subarray input buffers are not
required due to the non-uniform         required because each store and
distribution of targets.                forward unit contains a buffer
                                        memory.

Allows checking of data at              Allows checking of data at each
destination only.                       level of transfer with retry
                                        capability. As a result greater
                                        fault detection is possible.

Statistical design necessary.           Statistical design necessary.

3.3 PROCESSOR B


This section describes processor concept B, the configuration
produced by the second of the two independent teams.

APSP Block Diagram

As shown in Figure 3.3-1 the APSP can be considered to consist of a
Signal Processor followed by a Track Processor. The Signal Processor
enhances the signal by detecting pixels whose spatial and temporal
characteristics indicate the presence of a possible target. Such pixels are
referred to as hits, which are correlated over many time periods by the Track
Processor. The Track Processor creates files of correlated hits called track
files, which contain the address and intensity of each hit. Completed track
files may be transmitted to the ground link or be further processed by the
spacecraft computer.

Signal  Processor 

As shown in Figure 3.3-2 the signal processing function is
accomplished by temporal and spatial filtering, and merging of the blur
circle. At each step additional clutter is rejected and the data rate is
reduced. The MFPA chip must be clocked at varying rates such that it is
operating within its dynamic range. The usable frame rates at the MFPA chip
are in the range of from 10 to 100 frames per second. MFPA samples are then
converted to 10-bit digital words and sent to the temporal filter for further
processing.


Figure 3.3-1. APSP block diagram.


Figure 3.3-2. Signal processor block diagram.


Adaptive  Video  Encoder 

As discussed in Section VII 3.0, the temporal estimator predicts the
intensity of each pixel in the next frame based on a weighted sum of past
sample values. The predicted values, $\hat{q}_{ij}$, are converted back to
analog form and sent to the MFPA chip. On the MFPA chip, fat zero control is
used to subtract $\hat{q}_{ij}$ from the present measured value, $q_{ij}$.
The result, $\Delta q_{ij} = q_{ij} - \hat{q}_{ij}$, is the value sent to the
A/D converter. The estimator is designed to predict slowly changing clutter
and to have a very limited response to moving targets. Ideally, a non-zero
$\Delta q_{ij}$ corresponds to the intensity of a target. In addition the
temporal estimator performs gain normalization to compensate for frame rate
changes and variations in the responsivity of individual detectors. The
estimator produces data at a rate equivalent to 100 frames/sec.


Spatial  Filter 

Spatial filtering is performed on the data from the estimator to
further reduce clutter, and is accomplished by comparing each pixel with its
neighbors. Blur circle merging is closely related to spatial filtering. The
significance of blur circle merging is described later.
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Signal  Processor  Architecture 


This section describes the signal processor and related items. The
two basic areas are the Monolithic Focal Plane Array (MFPA) and the
Adaptive Video Encoder (AVE). The functional requirements are as follows:

1. An estimation of the pixel output must be produced based on
knowledge of previous outputs.

2. Because the cells on the detector chips are unique in their
response to identical inputs, their outputs must be normalized.

3. The change in observed data versus predicted data will be
presented to the Spatial Filter at a constant rate corresponding to
a frame rate equal to or less than 100 F/s (frames per second).

4. The signal processor must suppress phenomena which cause any
of the detector cells to saturate. (Reference Section VII 3.0.)

5. The signal processor must acknowledge a valid saturation of a
cell or cells and modify the frame rate to remove the cells from
saturation such as laser countermeasures. (Reference
Section VII 3.0.)

The relationship of these functions is shown in Figure 3.3-3.


Estimation 

The estimator is a temporal filter which makes predictions based
upon the past history of a cell. This could be performed in a number of
ways. The scheme chosen uses a finite number of past values, along with
weighting coefficients, to obtain a prediction that will best locate targets at
their earliest appearance. The estimation in weighted sum form is:

$$\hat{q}_{t_{n+1}}(i,j) = \sum_{k=0}^{m} a_k \, q_{t_{n-k}}(i,j)$$

where (i, j) are the coordinates of the cell on a chip and m is chosen
arbitrarily to be 4. (Reference Section VII 3.2 on the programmable predictor.)
The weights, or gains, $a_k$, will be supplied upon further analysis of this
type of estimation scheme.


Figure 3.3-3. Temporal filter.


The estimate is then passed through a D/A converter and compared
to $q_{t_{n+1}}(i,j)$. The difference (positive or negative) is then digitized
and added to the estimate to yield the digital value of $q_{t_{n+1}}$.

Hardware  Implementation 

The maximum data rate must be determined. On a per chip basis: each
chip has (128)(128) = 16,384 cells, and the assumed highest frame rate is
100 F/s. Thus a maximum data rate of (100)(16,384), or approximately
1.6 MHz, is possible. The maximum data rate corresponds to 100 F/s. From
this, it is seen that $\Delta q$ samples arrive approximately every
0.6 microsecond (the reciprocal of 1.6 MHz). No technology that will allow
five multiplications and the summing of five items to be performed at these
speeds is foreseeable. Hence, a pipeline or parallel implementation is
considered.
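The timing budget behind this conclusion can be worked out directly. In the sketch below the variable names are hypothetical, and the count of nine serial operations (five multiplications plus the four additions needed to sum five items) is an assumption used only to illustrate why a strictly serial arithmetic unit is ruled out.

```python
CELLS_PER_CHIP = 128 * 128   # 16,384 cells per MFPA chip
MAX_FRAME_RATE = 100         # frames per second

sample_rate_hz = CELLS_PER_CHIP * MAX_FRAME_RATE
sample_period_us = 1e6 / sample_rate_hz

# Assumed serial workload per sample: 5 multiplies + 4 adds.
SERIAL_OPS_PER_SAMPLE = 5 + 4
serial_op_budget_ns = 1000.0 * sample_period_us / SERIAL_OPS_PER_SAMPLE
```

Under these assumptions each serial operation would have to complete in well under 100 ns, which is consistent with the report's choice of a parallel or pipelined arrangement.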




In this instance, a parallel scheme will be faster and utilize less
hardware than a pipeline scheme. One implementation is shown in
Figure 3.3-4 and operates in the following manner.

Step 1) An estimate is obtained from the EQ (Estimate Queue, a CCD
memory) and shifted into the $\hat{q}$ register.

Step 2) The value in $\hat{q}$ is sent to be compared with q from the MFPA.

Step 3) The difference, $\Delta q$, is returned from the A/D and added to
the estimate to obtain the actual value, $q_0$. This $q_0$ is placed
into two registers.

Step 4) a. $q_0$ is multiplied by its associated gain and temporarily
stored.

b. $q_0$ is shifted into the $q_{t-1}$ queue. The $q_1$ being
shifted out goes into a register and also into the
$q_{t-2}$ queue. The $q_2$ being shifted out also goes
into a register and into the $q_{t-3}$ queue. The values of $q_3$
and $q_4$ are handled in the same manner except that $q_4$ is
not retained for future use.

Step 5) Once these values of $q_k$ are set up in their corresponding
registers, four multiplications take place which produce the
four quantities, individually: $a_1 q_1$, $a_2 q_2$, $a_3 q_3$,
$a_4 q_4$. The coefficients $a_k$ are obtained from a ROM. (Reference
Section VII 3.0.)

Step 6) After the above products are formed, they are summed
along with $a_0 q_0$ to yield the next estimate, $\hat{q}$.

Step 7) This estimate is now placed in the EQ to be used after the
remaining 16K - 1 cells have been processed.

Figure 3.3-4. Mechanization of estimator.

Notice that increasing the number of past values to use in the
estimate is easily done by replicating the last cell as many times as needed.
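Steps 1 through 7 can be mimicked with software queues to check the data flow. This is a behavioral sketch only: the class and method names are hypothetical, the queues start at zero, the gain values in the usage note are placeholders (the report leaves the gains to later analysis), and the analog subtraction and A/D conversion are replaced by an ordinary subtraction.

```python
from collections import deque


class EstimatorModel:
    """Queue-based model of the estimator for the cells of one chip."""

    def __init__(self, n_cells, gains):
        self.gains = list(gains)              # a_0 .. a_m (ROM contents)
        self.eq = deque([0.0] * n_cells)      # Estimate Queue
        # One delay queue per past frame: q_{t-1} .. q_{t-m}
        self.delay = [deque([0.0] * n_cells) for _ in self.gains[1:]]

    def step(self, q_measured):
        q_hat = self.eq.popleft()             # Step 1
        dq = q_measured - q_hat               # Step 2
        q0 = q_hat + dq                       # Step 3
        acc = self.gains[0] * q0              # Step 4a
        carry = q0
        for gain, queue in zip(self.gains[1:], self.delay):
            older = queue.popleft()           # Step 4b: value shifted out
            queue.append(carry)               # newer value shifted in
            acc += gain * older               # Steps 5 and 6
            carry = older                     # passed to the next queue
        self.eq.append(acc)                   # Step 7
        return dq

    def frame(self, samples):
        """Process one frame (one sample per cell, in cell order)."""
        return [self.step(q) for q in samples]
```

With placeholder gains that sum to one, such as `[0.5, 0.25, 0.125, 0.0625, 0.0625]`, a constant scene drives the differences to zero after five frames, and a later change in one cell appears as a non-zero difference in that cell only.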

Gain  Normalization 


Gain normalization is performed such that to units beyond the temporal
estimator, MFPA data appears to be uniform. One technique is to exercise
the MFPA after its construction to determine which cell on the entire array
yields the mean output for a constant input. This cell will then have a
normalization factor of unity. The remaining cells will have normalization
factors different from unity to adjust their outputs to correspond to the
weakest cell. However, it is likely that these normalization constants will
need to be modified due to changes on the MFPA during its life.

Since the quantity to be normalized is $\Delta q$, and $\Delta q$ is
primarily a function of the estimator, the $\Delta q$'s will be directly
multiplied by their corresponding normalization constants.

This implementation, along with the ability to update the constants,
is shown in Figure 3.3-5. The operation is as follows:

Step 1) While the $\Delta q$ is being digitized in the A/D, its corresponding
normalization constant is being shifted into a register and
back into the CCD memory.

Step 2) These two values, $\Delta q_D$ and $n_{\Delta q}$, are then
transferred to two other registers to allow the next $\Delta q_D$ to
immediately follow.

Step 3) $\Delta q_D$ and n are multiplied to yield $\Delta q_{DN}$, a
normalized value.
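The multiplication in Step 3 can be sketched as follows. The function names are hypothetical, and the choice of the weakest cell as the unity reference is an assumption made here so that every constant is at most one; the report's wording on the reference cell is not fully consistent.

```python
def normalization_constants(responses):
    """Per-cell constants that scale every response to the weakest cell."""
    weakest = min(responses)
    return [weakest / r for r in responses]


def normalize(dq_digitized, constants):
    """Multiply each digitized difference by its normalization constant."""
    return [dq * n for dq, n in zip(dq_digitized, constants)]
```

For cells with relative responses 1, 2, and 4, a uniform input produces raw values in the ratio 1:2:4, and the normalized values come out equal.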


CCD MEMORY: (16K CONSTANTS) x (8 BITS/CONSTANT) / (16K BITS/CHIP) = 8 CHIPS

Figure 3.3-5. Gain normalization.

It should be noted if the MFPA cell with the greatest output for a
constant input were given a normalization constant of unity, the weakest cell
would then have the largest constant associated with it, and the multiplication
could possibly yield a number greater than 8 bits in length.

Along with "normal" operation, the updating of constants is performed
via the mux "on top" of the CCD memory. At the appropriate time, the select
line is changed and the new constant shifted in to replace the previous
constant.

Output 

The temporal filter output, $\Delta q_t(i,j)$, is to be presented to the
Spatial Filter at a constant rate corresponding to the fixed frame rate.
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Since  the  CCD  detector  chips  are  assumed  to  operate  linearly  (i.e., 
at  an  interval  t a bucket  will  accumulate  x photons  and  at  t/  2 it  will  accu- 
mulate x/2  photons),  using  frame  rates  greater  than  10  that  correspond 
to  powers  of  2 greatly  simplifies  the  output  problem.  Thus  we  obtain: 


Frame Rate (F/s)     Multiplier for Δq

      10                    1
      20                    2
      40                    4
      80                    8

To better understand this simple scheme, let us examine the Δq's at
higher frame rates. First of all, Δq may be positive or negative coming into
the A/D. This implies that the estimator overshoots or undershoots the
actual q of any detector cell. Over many frames the number of overshoots
will equal the number of undershoots, implying that the Δq's over this range
will add to zero, which is what is desired in the absence of targets. Thus,
at higher frame rates, the probability of the sum of Δq's equaling zero is high.

However, in the worst case, this sum could equal (approximately)
nΔq, where n is given by:

n = (higher frame rate) / 10


Beyond this, given an effective estimator, nΔq (to a limit on n) should be
much less than a target value. Therefore, taking one of the n estimates and
multiplying it by n will yield a worst case difference but also cut the
hardware to a minimum. Lastly, using n as a power of 2 allows the normalized
Δq to be shifted by 1, 2 or 3 bits to effect the multiplication by 2, 4 or 8,
respectively.
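
The shift-based multiplication can be sketched as follows. The function and
table names are illustrative assumptions; only the frame rates and the
corresponding multipliers come from the text.

```python
# Sketch of the output scaling; function and table names are illustrative.
SHIFT_FOR_RATE = {10: 0, 20: 1, 40: 2, 80: 3}   # frame rate (F/s) -> left shift

def scale_output(dq_dn: int, frame_rate: int) -> int:
    """Multiply a normalized delta-q by n = frame_rate / 10 using a bit shift."""
    shift = SHIFT_FOR_RATE[frame_rate]
    # For Python integers, << multiplies by 2**shift for negative values as well.
    return dq_dn << shift
```

For example, `scale_output(5, 80)` shifts by 3 bits, which is the
multiplication by 8 listed for the 80 F/s rate.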

The implementation follows easily. Since we are assuming 4 frame
rates, all powers of 2, we need only one register to hold ΔqDN and a 4:1
mux to select the correct shift, as shown in Figure 3.3-6.
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Figure  3.3-6.  Temporal  filter  output  scaling. 


Two points should be noticed: 1) the two inputs to the ΔqDN register and
2) the mux select.

(1) Since data can enter at rates ≥10 F/s (frames per second),
the ΔqDN register needs 2 inputs: one from the normalization
process directly for the 10 F/s rate, and one from a queue in
which normalized values are placed for rates >10 F/s.

The need for a queue results from the mechanization of
handling the higher frame rates. Since one Δq is selected and
multiplied by a power of 2 to obtain the final Δq, the remaining
N-1 samples are ignored. However, since the first set of
samples is presented to the normalization process at rates >10 F/s,
the normalized values need to be stored so that they can be
presented to the Spatial Filter at a rate corresponding to 10 F/s.

(2) The mux select that provides the Spatial Filter with the properly
scaled data is the same select which operates the select on the
clock mux for the MFPA described later.

Due to the effect of data entering the signal processor at >10 F/s
and the manner in which this data is handled, the signal processor must also
be equipped with logic to remember how many of the next Δq's to ignore. This
is implemented with simple counters.


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Impulse  Noise  Suppression 

A problem arises when a cell in the MFPA saturates: is it caused by
a target or was the cell ionized due to cosmic effects? At the highest
tracking rate, 100 F/s, a target would have to be moving 1 mile in
10 ms, or 0.36 million miles per hour, to illuminate a cell for only a
single frame. Thus we can conclude the following:

CELL (i, j)

Time      n-1        n            n+1
Value     Nominal    Saturated    Nominal

From this we see that a cell saturating for one frame time is caused
by something other than a target. This could be handled as follows: on a
per detector chip basis, when one of the cells is sensed to saturate, flag the
signal processor. The signal processor then takes no action save
remembering that a saturation took place. At the next frame time the signal
processor looks for a saturation signal. If none arrives, the fact of the
previous saturation is forgotten and processing continues normally. However,
if another saturation signal arrives a target could be present and hence the
normal saturation control procedure is invoked.

The implementation is a simple logic circuit that looks for 2
consecutive saturation pulses from an MFPA chip as shown in Figure 3.3-7.


(Signals shown: MFPA CHIP SATURATION, POSSIBLE TARGET PRESENT.)

Figure 3.3-7. Impulse noise detection.
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The  circuit  operation  is  trivial.  A pulse  arrives  at  time  tfi  and  is 
remembered  in  the  D F/F.  The  line  x goes  active  only  if  another  saturation 
pulse  arrives  on  the  following  clock,  otherwise  the  first  saturation  detect  is 
lost. 
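
A behavioral sketch of this circuit, modeling the D flip-flop as one bit of
state and line x as the AND of the current and remembered saturation flags.
The class and method names are illustrative assumptions.

```python
class ImpulseSuppressor:
    """One-bit memory (D flip-flop) plus AND gate; names are illustrative."""

    def __init__(self):
        self.previous = False   # state held in the D flip-flop

    def clock(self, saturated: bool) -> bool:
        """Clock in this frame's saturation flag; return the state of line x."""
        x = saturated and self.previous   # active only on 2 consecutive pulses
        self.previous = saturated         # remember this frame for the next clock
        return x
```

An isolated saturation pulse therefore never activates x; only a pulse that
follows another pulse on the previous clock does.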


Saturation  Detection  and  Control 

As was seen in the previous paragraph, only 2 consecutive
saturation signals from an MFPA chip will cause any action to be taken. The
obvious action is to step up the frame rate of the MFPA. As was seen
previously, the ideal frame rates are 10 F/s times powers of 2: 10, 20, 40 and 80 F/s.

Thus, the following approach can be used. If a valid saturation occurs,
select the next higher frame rate. This will cause one of the following:

10 → 20 F/s
20 → 40 F/s
40 → 80 F/s

with valid targets being incapable of saturating a cell in the MFPA at
80 F/s.

However, the reverse situation also exists: the MFPA running
too fast and needing to be slowed. The saturation detection circuit can be
used for this also. By noting that a frame rate greater than 10 F/s is
currently being exercised and that no valid saturations occur, the frame rate
can be reduced (by reversing the arrows in the above table).

The implementation is straightforward and a simple technique is
pictured in Figure 3.3-8. The saturation controller looks for a valid
saturation signal from the impulse suppressor and utilizes the following
logic: if a signal is present, step up the frame rate; if it is not, possibly
step down the frame rate. The reason for only possibly stepping down the
frame rate is to prevent a thrashing type of operation between a frame
rate that causes valid saturations and one that doesn't.

Thus valid saturations speed up the MFPA (and select the appropriately
scaled output), and the absence of saturations will eventually slow
down the MFPA.
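
One way to realize the "possibly step down" behavior is to require a run of
saturation-free frames before slowing the MFPA. The sketch below assumes
this; the parameter `quiet_frames_to_step_down` and its default value are
hypothetical, since the report does not specify the step-down criterion.

```python
RATES = [10, 20, 40, 80]   # allowed frame rates in F/s, from the text

class SaturationController:
    """Step the frame rate up on valid saturations, down after a quiet run."""

    def __init__(self, quiet_frames_to_step_down=8):
        # quiet_frames_to_step_down is a hypothetical anti-thrashing parameter.
        self.index = 0      # index into RATES; start at 10 F/s
        self.quiet = 0      # consecutive frames without a valid saturation
        self.limit = quiet_frames_to_step_down

    @property
    def frame_rate(self):
        return RATES[self.index]

    def update(self, valid_saturation: bool) -> int:
        """Process one frame's impulse-suppressor output; return the new rate."""
        if valid_saturation:
            self.quiet = 0
            self.index = min(self.index + 1, len(RATES) - 1)
        elif self.index > 0:
            self.quiet += 1
            if self.quiet >= self.limit:
                self.quiet = 0
                self.index -= 1
        return self.frame_rate
```

A larger quiet-run limit trades slower recovery to 10 F/s for less risk of
oscillating between two adjacent rates.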
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SATURATION  CONTROL 


Figure  3.  3-8.  Saturation  control  mechanization. 


Spatial  Filter 

The purpose of the Spatial Filter is to determine the locations of
pixels which are illuminated by targets. A functional block diagram is
shown in Figure 3.3-9. The unit receives sequential Δq values for each
pixel from the Temporal Filter and after processing this data reports pixel
"hits" to the Trackers. The objective is to report the address of the single
pixel which most closely represents a target's position. Additionally the
difference in amplitude between the target pixel and the average of the
adjacent pixels is reported along with the pixel address. To accomplish the
above function the following two processes are employed:

1. Four-direction adjacent pixel comparison

2. Blur-circle merging

Adjacent Pixel Comparison. The adjacent pixel comparison process
is illustrated in Figure 3.3-10. A three by three window centered about the
candidate pixel is used to detect the presence of a target. The amplitude of
a pixel's illumination is given by A_ij. If the magnitude of the candidate
pixel is greater than the average magnitude in all four directions, it is
reported as a "hit."
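
A minimal sketch of the comparison, assuming that the "average magnitude" in
a direction is the mean of the two opposite neighbors along that direction
(horizontal, vertical, and the two diagonals). That interpretation, and the
function name, are assumptions; the original figure is not available here.

```python
# Pairs of opposite neighbor offsets, one pair per direction (assumed meaning).
DIRECTIONS = [((0, -1), (0, 1)),    # horizontal
              ((-1, 0), (1, 0)),    # vertical
              ((-1, -1), (1, 1)),   # diagonal
              ((-1, 1), (1, -1))]   # anti-diagonal

def find_hits(frame):
    """Return (i, j, excess) for interior pixels exceeding all four averages."""
    hits = []
    rows, cols = len(frame), len(frame[0])
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            center = abs(frame[i][j])
            averages = [(abs(frame[i + a][j + b]) + abs(frame[i + c][j + d])) / 2
                        for (a, b), (c, d) in DIRECTIONS]
            if all(center > avg for avg in averages):
                # Center minus the mean magnitude of the 8 adjacent pixels.
                excess = center - sum(averages) / 4
                hits.append((i, j, excess))
    return hits
```

Each reported tuple carries the pixel address and the amplitude difference
relative to the adjacent pixels, matching the two items the text says are
passed on.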
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PIXEL  Array 
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Add  amplitude  and  pixel  address  to  Hit  file. 

Figure 3.3-10. Four direction adjacent pixel comparison.


Two hit file buffer memories are provided to allow blur-circle
merging to be performed concurrently with adjacent pixel comparison.
Blur-circle merging is performed on the hit file generated during the nth
scan cycle, stored in one buffer, while the hit file generated during the
(n + 1)st cycle is being stored in the other buffer.

Blur Circle Merging. A point target will appear on the MFPA
blurred as a circle. The size of the pixels has been selected to be equal to
the blur circle for a point target, for optimum signal-to-noise ratio purposes.
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Because  pixels  are  of  the  same  size  as  blur  circles,  a target  will 
almost  always  be  seen  in  more  than  one  pixel  at  a time. 


(Illustrations of blur-circle positions relative to pixel boundaries,
labeled "common" and "very common".)


The adjacent pixel comparison process merges some instances of
multiple illumination. However, as the blur circle center approaches pixel
boundaries the adjacent pixel comparison process is unable to perform the
merging and reports more than one hit. Therefore further processing of the
hits is required. The adjacent pixel comparator loads the hit file buffer with
one data item for each hit recorded. The blur circle merging algorithm
identifies and examines clusters of hits. Using intensity and the shape of the
cluster as deciding criteria, each cluster is merged into a single hit.
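
The merging step might be sketched as below. This simplified version groups
hits that touch (including diagonally) and keeps the brightest hit of each
group; it uses intensity only and ignores cluster shape, so it is an
assumption-laden stand-in rather than the algorithm of the report.

```python
def merge_clusters(hits):
    """hits: dict {(i, j): intensity}. Returns one (i, j, intensity) per cluster."""
    remaining = dict(hits)
    merged = []
    while remaining:
        seed = next(iter(remaining))
        cluster = {seed: remaining.pop(seed)}
        stack = [seed]
        # Flood fill over the 8-connected neighbors that are also hits.
        while stack:
            i, j = stack.pop()
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    n = (i + di, j + dj)
                    if n in remaining:
                        cluster[n] = remaining.pop(n)
                        stack.append(n)
        # Simplification: the brightest pixel stands for the whole cluster.
        (bi, bj), best = max(cluster.items(), key=lambda kv: kv[1])
        merged.append((bi, bj, best))
    return merged
```

A shape criterion, as mentioned in the text, could be added by inspecting
the extent of each `cluster` before choosing its representative pixel.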

Tracking  Processor 

From a data processing point of view, tracking in the APSP consists
of sorting the continuously incoming hit reports into track files and
discarding those hits which do not appear to belong to any track.

The tracking function can be subdivided into the following tasks and
subtasks:

1. Track initiation:
   a. Recognizing potential target
   b. Determining that it is not part of any track
   c. Initiating a microprocessor

2. Monitoring of a track:
   a. Update state vector at each frame
   b. Handle special conditions:
      Crossing of chip boundaries
      Missed measurements
      Bifurcations
   c. Produce track file

3. Ending a track:
   a. Monitor kinetic properties of tracks
   b. Identify clutter
   c. Count missed measurements
   d. Terminate the tracking if:
      The track is clutter, or
      There are too many consecutive missed measurements
   e. Transmit track file to ground link.


Given the estimated number of hits per frame that need to be
processed and the short time in which this processing has to be done, it is
clear that some sort of parallel processing is required. In order to use
simple, low power, identical processing elements and to provide the
necessary computing power, array processing is best suited. Each processing
element in the array has to be assigned a portion of the tracking task. This
assignment, which is always the central issue in the design of array
processors, is generally referred to as the resource allocation problem.

Another issue generally encountered in array processors, and one
particularly acute in this application, is that of data flow. Therefore
solutions of the resource allocation and the data flow problems are going to
characterize the design of the array processor.


The  One  Track  per  Processing  Element  Approach 

This design enables the hits detected by the spatial filter to be
broadcast to all processing elements over a bus. Each track currently being
monitored has one processing element assigned to it. When the spatial filter
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is  broadcasting  hits,  each  micro-processor  looks  only  for  hits  falling  within 
the tracking gate of the track it is monitoring. As consecutive hits falling
into the tracking gate are acquired, a track file containing all the past
history information of that track is synthesized.

Each processing element acquires from the bus only information
pertaining to the immediate vicinity of the target. Most of the time this
amount of information is sufficient to continue the tracking process. At
times, however, global information is needed. A special processor, called
the supervisor, is used for this purpose.

The supervisor determines which hits in each frame were not picked
up by any processing element. All such hits are potential new tracks and an
idle processing element must be assigned to each by the supervisor.

Whenever a processing element finds more than one hit within the
tracking gate, it must determine whether this is a case of a track crossing,
a bifurcation, or a new track appearing. Since this decision requires global
information, it will have to be made by the supervisor.

Whenever a processing element decides that the track it was
monitoring has to be ended, it notifies the supervisor, which then deactivates
that processing element and marks it as available.

Figure  3.3-11  shows  the  configuration  of  a processing  element  in  the 
array.  Data  is  drawn  from  the  Hit-Bus.  Among  other  things,  each  data 
item  on  the  Hit-Bus  contains  sequentially  the  (i,  j)  coordinate  pair  of  the  hit. 

The Mousetrap is a programmable hardware device which, when
provided with i_min, i_max, j_min, j_max, will acquire all those hits from the
bus whose coordinates (i, j) fall within that rectangle. The data for such
hits is passed to the microprocessor. When data items with i > i_max appear on
the bus, the Mousetrap notifies the microprocessor that no further hits will
appear. The microprocessor then processes the hits. Hits falling within the
gate will be acknowledged to the supervisor. Tracking information will be
pushed onto the Track File Queue. A new gate is computed and the
Mousetrap is programmed accordingly. Thereafter the microprocessor
waits for new hits to be transmitted by the Mousetrap.
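
The gate-matching behavior of the Mousetrap can be modeled in a few lines.
This is a behavioral sketch only: the class and method names are
illustrative, and it assumes hits appear on the bus in increasing order of
i, which is what allows "i > i_max" to signal that no further hits can fall
in the gate.

```python
class Mousetrap:
    """Behavioral model of the programmable gate; names are illustrative."""

    def __init__(self):
        self.gate = None      # (i_min, i_max, j_min, j_max)
        self.captured = []    # hits handed to the microprocessor
        self.done = False     # set once the bus has passed the gate

    def program(self, i_min, i_max, j_min, j_max):
        """Load a new rectangular gate and clear the previous frame's state."""
        self.gate = (i_min, i_max, j_min, j_max)
        self.captured = []
        self.done = False

    def observe(self, i, j, data=None):
        """Called for every hit on the bus (assumed ordered by increasing i)."""
        i_min, i_max, j_min, j_max = self.gate
        if i > i_max:
            self.done = True          # notify: no further hits will appear
            return False
        if i_min <= i and j_min <= j <= j_max:
            self.captured.append((i, j, data))
            return True               # corresponds to a "hit taken" indication
        return False
```

After `done` is set, the microprocessor would process `captured`, compute a
new gate, and call `program` again for the next frame.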



Figure  3.3-11.  Processing  element  configuration. 

The  Track  File  Queue  is  a serial  memory  where  the  microprocessor 
pushes  data  in  from  one  end  while  the  ground  link  reads  data  off  the  other 
end.  Data  is  always  moving  through  this  memory  as  through  a pipeline. 

The  Program  Memory  can  be  loaded  with  programs  and  constants 
from  an  external  system.  Once  loaded,  it  determines  the  behavior  of  the 
microprocessor.  From  the  point  of  view  of  the  microprocessor,  this  is  a 
read-only  memory. 

The Scratchpad is a relatively small memory containing all variables
used for tracking in this processing element. The foregoing discussion
indicated that each active processing element processes only one track. In fact
one microprocessor contains multiple track files, as described in Section 5
(Software) of this report.

This  design  has  two  important  weak  points:  the  supervisor  appears 
to  be  very  complex  and  the  data  rate  on  the  bus  is  very  high. 


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The  data  rate  on  the  bus  can  be  reduced  by  dividing  the  focal  plane 
into  a number  of  overlapping  sections  and  assigning  a separate  bus  to  each. 

Each  mousetrap  is  then  preceded  by  a multiplexor  which  selects  one 
of  the  buses.  The  selection  is  determined  by  the  microprocessor  based  on 
the  section  in  which  the  gate  is  located.  The  data  rate  is  cut  down  by  a factor 
(approximately)  equal  to  the  number  of  buses  employed.  Thus  the  data  rate 
on  the  bus  can  freely  be  traded  off  for  added  hardware. 

Track  Initiation  Hardware 

Hits reported by the spatial filter can be grouped into three
categories:

a.  The  next  position  of  the  target  track 

b.  Clutter  that  appears  to  be  the  continuation  of  a false  track  left  by 
clutter 

c.  Hits  that  do  not  appear  to  be  the  continuation  of  any  track. 

The  latter  kind  of  clutter  will  cause  a large  number  of  tracks  to  be  initiated 
and  terminated  at  each  frame  time.  This  function  represents  the  largest 
computational  load  on  the  supervisor.  In  order  to  accomplish  these  functions 
the  supervisory  task  must  be  divided  into  several  independent  functions. 

Such  a division  allows  for  parallel  processing  at  the  supervisor  level. 

Figure 3.3-12 shows the hardware configuration for the track
initiation and deletion functions. The control function is distributed
throughout the array of mousetraps. Only two subfunctions are performed on a
global basis: mousetrap distribution over the four buses and mousetrap
chaining to determine the sequence in which mousetraps are assigned to hits.

When a track is terminated, the associated mousetrap is deactivated
and placed at the end of the queue of idle mousetraps waiting for new hits.
This process is performed in two steps: (1) determine which bus to monitor
and (2) determine the number of mousetraps in the mousetrap queue. The
first step is performed by a special hardware device which contains four
counters containing the current number of idle mousetraps associated with
each bus and a logic unit monitoring the four counters and determining which
bus has the least number of idle mousetraps. The output of this unit is used
by a mousetrap, when it becomes idle, to determine which bus it should
monitor. Thus the mousetraps are uniformly distributed over the four buses.

Figure 3.3-12. Track initiation and deletion hardware.

Each mousetrap contains a counter which indicates its position in the
mousetrap queue. One additional counter is used to indicate the number of
the next empty position at the end of the queue. When a mousetrap is in the
process of becoming idle it transfers the contents of this special counter into
its own position counter and increments the special counter by one.
Mousetraps are assigned to new hits in the following fashion:

1. If a hit is picked up by an active mousetrap this action results in
a pulse appearing on the hit taken bus.

2. If a hit is not picked up by any mousetrap the hit is assigned to
the mousetrap at the head of the queue and all queue position
counters are decremented by one including the end of queue
counter.

3. When a mousetrap position counter equals 1 it places itself in the
ready state. In this state all hits are picked up from the bus
until a hit is assigned to this mousetrap as described in 2 above.
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4. 


When a new hit is assigned to a mousetrap the i and j values are
placed in the gate boundary registers. These registers are
incremented and decremented appropriately to form a standard
9-pixel gate for the next frame.
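
The counter-based queue described in the steps above can be sketched in
software as follows. The class name, the use of a dictionary in place of
per-mousetrap hardware counters, and the gate tuple layout are illustrative
assumptions.

```python
class IdleQueue:
    """Software model of the distributed position counters; names are illustrative."""

    def __init__(self):
        self.position = {}   # mousetrap id -> position counter (1 = head of queue)
        self.end = 1         # special counter: next empty position at the end

    def release(self, trap_id):
        """A mousetrap becoming idle copies the end counter, then increments it."""
        self.position[trap_id] = self.end
        self.end += 1

    def assign(self, i, j):
        """Give an unclaimed hit at (i, j) to the head mousetrap.

        Returns (trap_id, gate) or None if no mousetrap is idle.
        """
        head = next((t for t, p in self.position.items() if p == 1), None)
        if head is None:
            return None
        del self.position[head]
        for t in self.position:          # all queue position counters step down
            self.position[t] -= 1
        self.end -= 1                    # including the end-of-queue counter
        gate = (i - 1, i + 1, j - 1, j + 1)   # 3 x 3 window = 9-pixel gate
        return head, gate
```

In hardware each mousetrap would hold and decrement its own counter; the
dictionary here simply gathers those counters in one place.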

It should be noted that the state of the mousetrap associated with each
processing element determines the state of the microprocessor. If the
microprocessor is available, then the mousetrap is in the idle queue. When
the mousetrap becomes active and picks up a hit, the microprocessor is
initialized to start a track.

The  One  MFPA  Chip  per  Processing  Element  Approach 

With this design approach each MFPA chip has a dedicated
processing element. That processing element monitors all hits and tracks within
the MFPA chip. The Hit-Bus is thus eliminated and the resource allocation
problem is solved a priori.

As shown in Figure 3.3-13 a bus is used to move the processed
track file data to the respective memory. To reduce traffic on this bus,
processing elements will not transmit track data pertaining to new tracks
until the track has at least 10 valid entries. Track file data items are
routed to the respective track file memory by means of an ID-number unique
to that track. The assignment of ID-numbers and track file memories to
tracks is still done by a supervisor but, since only tracks older than
10 frames receive such allocation, supervisor traffic is much lower.

This  approach  does  not  seem  very  promising  because  it  does  not 
offer  dynamic  allocation  of  processing  elements  and  because  it  seems  very 
doubtful  whether  fast  enough  processing  elements  will  be  available  to  process 
all  the  tracks  that  could  appear  on  an  MFPA  chip  in  one  frame  time. 

Tracking  Algorithms 

In  the  designs  presented,  the  tracking  function  is  partially  implemented 
in  hardware  and  partially  in  software.  For  example  the  detection  of  hits  on 
the  Hit-Bus  that  fall  within  the  current  gate  is  a hardware  function  performed 
by  the  mousetrap.  Computing  the  velocity  and  acceleration  of  the  target  is 
done  by  executing  instructions  from  the  Program  Memory  in  the  microprocessor 
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Figure  3.  3-13.  The  one  MFPA  chip  per  processing  element  approach. 

and  is  therefore  a software  function.  Thus  the  term  "tracker"  as  used  in 
this  discussion  means  a software  algorithm. 

The  tracking  software  can  be  viewed  as  consisting  of  two  independent 
functions.  The  purpose  of  one  is  to  determine  the  search  gate,  set  up  the 
mousetrap  and  retrieve  the  hits  from  the  mousetrap.  It  also  services  the 
Track  File  Memory. 

The other software module is much more complex. It performs the
following:

Rejects clutter tracks

Selects a hit when more than one appears in the gate

Monitors intensity

Performs any other functions desired.
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Frame  Rate 


It is desirable to adjust the frame rate in such a way that even for the
fastest moving targets (i_{t+1}, j_{t+1}) is an immediate neighbor of (i_t, j_t). In
other words the target moves at most one pixel on the grid during each
frame. Therefore the frame rate will have to be adjusted as a function of the
size of a pixel. The apparent velocity of the fastest expected moving target
(in pixels/sec) is a parameter in the frame rate adjustment.

Based on the requirement that targets with apparent velocities from
0 to 70 pixels/sec have to be tracked, a range of frame rates between
10 and 80 Hz appears adequate.
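
The one-pixel-per-frame condition reduces to choosing a frame rate (in Hz)
at least as large as the apparent velocity (in pixels/sec). A minimal
sketch, with an illustrative function name, using the four rates discussed
earlier:

```python
FRAME_RATES = (10, 20, 40, 80)   # Hz

def select_frame_rate(max_velocity):
    """Lowest rate at which a target moves at most one pixel per frame.

    max_velocity is the apparent velocity in pixels/sec.
    """
    for rate in FRAME_RATES:
        if max_velocity <= rate:
            return rate
    raise ValueError("target moves more than one pixel per frame at 80 Hz")
```

For the 70 pixels/sec upper requirement this selects 80 Hz, and for slow or
stationary targets it selects 10 Hz, which is consistent with the stated
10 to 80 Hz range.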

Frame rates must be adjustable because too low a rate leads to large
gate sizes, whereas too high a rate will cause excessive amounts of
redundant tracking data to be output.

Tracking  Algorithm  Requirements 

Due  to  the  adjustable  frame  rate,  finding  the  target  in  the  next  frame 
is  a relatively  easy  task. 

Tracking will be performed based on the physical laws that govern
accelerated motion. By monitoring velocity and acceleration in the state
vector for a track it is possible to discriminate between targets and
clutter. When clutter is monitored over a period of time it is probable
that at some points its velocity will exceed the maximum expected velocity of
a target or that its acceleration will surpass a set limit (e.g., 5 g's).

The tracking algorithm does not have to predict the position of the
target in the next frame. That can be done by simply searching the position
of the target in the previous frame and the 8 immediately adjacent pixels.

Unlike α-β filters and Kalman filters, which track the target by
computing a weighted sum of its predicted and measured positions, tracking in
APSP is based solely on measured positions. The reason why such a simple
algorithm can be employed is that accurate and frequent measurements are
available.
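
A sketch of a velocity and acceleration check built only from measured
positions. Because positions are quantized to whole pixels, the sketch
estimates velocity over a window of several frames; the window length, the
function name, and the expression of both limits in pixel units are
assumptions (converting a limit such as 5 g's to pixels/sec² requires the
pixel footprint, which is not given here).

```python
import math

def is_clutter(positions, frame_rate, v_max, a_max, window=8):
    """Flag a track whose measured kinematics exceed target limits.

    positions: list of (i, j) pixel coordinates, one per frame.
    v_max in pixels/sec, a_max in pixels/sec^2 (assumed units).
    """
    span = window / frame_rate   # seconds covered by one estimation window
    # Velocity from finite differences of positions `window` frames apart.
    velocities = [((i1 - i0) / span, (j1 - j0) / span)
                  for (i0, j0), (i1, j1) in zip(positions, positions[window:])]
    if any(math.hypot(vi, vj) > v_max for vi, vj in velocities):
        return True
    # Acceleration from finite differences of velocities `window` frames apart.
    accelerations = [((v1[0] - v0[0]) / span, (v1[1] - v0[1]) / span)
                     for v0, v1 in zip(velocities, velocities[window:])]
    return any(math.hypot(ai, aj) > a_max for ai, aj in accelerations)
```

A track moving steadily at 40 pixels/sec passes this check against a
70 pixels/sec limit, while a track that jumps 10 pixels between frames
does not.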
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The  Effect  of  Gaps  Between  Chips  of  the  MFPA 

The  frame  rate  will  always  be  adjusted  as  a function  of  the  size  of  the 
footprint  of  a pixel  in  such  a way  as  to  make  the  radius  of  the  gate  equal  to 
one  pixel  or  less  for  all  manmade  flying  objects. 

At the points where MFPA chips are joined a number of pixels are
missing. For this reason the gate size will have to be increased at that
point. The relation between gate radius (R) and gap width (G) is

R = G + 1

Basically the area of the gate is

A = π (G + 1)²

Table  3.3-1  shows  actual  gate  size  (obtained  by  counting  pixels)  as  a function 
of  gap  size. 


TABLE 3.3-1.

Maximum Gate Radius     Maximum Actual Gate Area

Much of the gate area can be eliminated because the target could only
reach certain points within the circle (gate) if its acceleration was very high
(e.g., greater than 5 g's). It cannot be stated how much of the gate circle
can be eliminated without knowing the footprint of a pixel.

Another way to cut down on gate area is to relate it to the speed of
the target. The above relation for radius of the gate referred to a target
moving at maximum speed (e.g., 70 pixels/second). If the target in fact
moves slower, the gate radius can be decreased accordingly.
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3,4  CONSOLIDATED  ARCHITECTURE 

After detailed examination of the two proposed architectures, the
basic philosophy of the first was used and refined. It is this concept which
is now explained. Figure 3.4-1 is a functional partitioning of the APSP.
The following is a discussion of the tradeoffs and selection of the temporal
filter, spatial filter, detection logic and the track processor.

Temporal  Filter 

The filtering provided by the temporal detection filter (TDF) rejects
slowly moving or stationary clutter edges while passing moving targets.
The system performance is provided by the filter noise equivalent bandwidth
and clutter rejection curves.

The  filter  design  philosophy  is  to  provide  target  detection  on  a per 
pixel  basis  via  the  hardwired  TDF.  The  trackers  use  this  information  to 
generate  track  information.  An  adaptive  temporal  filter  (ATF)  is  also  used 
to  provide  refined  temporal  filter  algorithms.  For  example,  an  accelerating 
target  requires  an  ATF  which  maintains  frequency  "lock-on"  as  the  target 
velocity  changes. 

Prior to filtering by the temporal filter processor the target has
been filtered by the optics and by the detector geometry. Both filters are
low-pass filters. The corresponding blur size is approximately equal to the
detector element size.
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Figure  3.4  -1.  Functional  units  of  APSP. 
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As  shown  in  Section  6.  2,  the  temporal  detection  filter  needs  >o  have 
at least third-order difference filtering capability in order to discriminate
effectively against moving cloud edges. Higher order filters provide little
performance improvement for the increased cost.

In  addition,  an  effective  variable  frame  time  is  required  at  the  front 
end  of  the  temporal  filter  to  provide  an  optimum  match  to  targets  of  varying 
velocity. 

Figure 3.4-2 shows the temporal filter functional block diagram. The
digital signals from the A/D first enter an N frame accumulator, N = 2, 4,
8, 16. The purpose of the integrator is to provide a match to a range of
target velocities for a constant detector/mux array frame rate, nominally
10 Hz. For a target rate of 5 pixels per second, the 10 Hz frame rate is
optimum. At the other extreme, for 0.3 pixels per second, a frame rate of
0.6 Hz is optimum, hence N = 16 frames of integration is required in the
latter case.
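
Both examples in the paragraph above correspond to roughly two frames per
pixel of target motion (10 Hz at 5 pixels/sec, 0.6 Hz at 0.3 pixels/sec).
The sketch below assumes that ratio holds in between and picks the
accumulation length whose effective rate, 10 Hz / N, is closest to it. The
ratio, the inclusion of N = 1 for no accumulation, and the function name
are assumptions, not statements from the report.

```python
BASE_RATE_HZ = 10.0
ALLOWED_N = (1, 2, 4, 8, 16)   # N = 1 (no accumulation) is an assumed option

def frames_to_accumulate(velocity_pixels_per_s, frames_per_pixel=2.0):
    """Pick N so that BASE_RATE_HZ / N best matches the desired frame rate."""
    desired_rate = frames_per_pixel * velocity_pixels_per_s
    return min(ALLOWED_N, key=lambda n: abs(BASE_RATE_HZ / n - desired_rate))
```

With N = 16 the effective rate is 0.625 Hz, which is the closest available
match to the 0.6 Hz quoted for a 0.3 pixels/sec target.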

The third order difference filter in a transversal implementation is
also shown in Figure 3.4-2. Three frames of memory, 160K words each,
and four multipliers and an adder are used in this implementation. A
recirculating recursive filter uses less memory but it may have high round off
errors.
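
A transversal third-order difference can be sketched with three stored
frames and four tap weights. The weights 1, -3, 3, -1 are the standard
binomial coefficients of a third difference; the report does not list its
multiplier values, so treating them as these weights is an assumption, as
are the class and method names.

```python
from collections import deque

# y[n] = x[n] - 3 x[n-1] + 3 x[n-2] - x[n-3], applied independently per pixel.
WEIGHTS = (1, -3, 3, -1)

class ThirdDifferenceFilter:
    """Transversal filter with three frames of memory; names are illustrative."""

    def __init__(self, pixels):
        zero = [0] * pixels
        # Newest stored frame first; maxlen=3 discards the oldest automatically.
        self.history = deque([zero, zero, zero], maxlen=3)

    def step(self, frame):
        """Filter one incoming frame (a list of pixel values)."""
        taps = [frame, *self.history]   # four taps: current plus three stored
        out = [sum(w * tap[p] for w, tap in zip(WEIGHTS, taps))
               for p in range(len(frame))]
        self.history.appendleft(list(frame))
        return out
```

Once the memory has filled, constant, linearly varying, and quadratically
varying pixel values all produce zero output, which is the sense in which
the filter suppresses slowly changing backgrounds.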
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Figure  3,4-2.  Temporal  discrimination  filter  block  diagram. 
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Adaptive  Temporal  Filter 

Targets that have been acquired by the tracker will be further
processed in the Adaptive Temporal Filter (ATF). The principal utility of
this processor is to increase SNR, clutter rejection and target location
capability. A priori information about the target location and predicted state is
furnished by the tracker to the ATF. This results in the detector/mux array
being partitioned into smaller arrays around the present target pixel position.
Software filtering algorithms can be utilized in this small subarray, resulting
in enhanced system performance while properly allocating the track processor
resources. A preliminary algorithm is discussed in Section 5.2.

Star  Discrimination  Filter 

The sensor which looks above the horizon (ATH) has a high density of
moving stars in the background. The situation is illustrated in Figure 3.4-3
where the stars are assumed to move at 2.9 pixels/sec for this example.
The rate of threshold exceedances is estimated to be less than 1 percent in
the Galactic plane at a threshold of 5 watts/sr. (Ref. Table 5.4-3.)

A moving  track  gate  is  established  on  the  basis  of  the  known  star 
velocity.  Those  targets  which  fall  within  the  predicted  position  and  intensity 
range  are  declared  to  be  stars  after  several  frames.  These  targets  are  then 
deleted  from  the  track  file.  All  other  targets  are  classified  as  potentially 
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Figure  3.4  3.  Star  background  discrimination. 



acceptable  targets  and  transmitted  to  the  module  processor  for  identification. 
A preliminary  algorithm  is  discussed  in  Section  5.3. 

Waveform  Discrimination  Filter 

A waveform discrimination filter can be implemented with the temporal
filter illustrated in Figure 3.4-4. A first-difference transversal filter
is followed by a simple threshold. When a signal of positive polarity is
detected, the signal of opposite polarity occurring in the next two integration
periods is clocked out and compared against a threshold of reversed polarity
set at about 80 percent of the original threshold. As can be seen from comparison
of an edge waveform and a point target waveform, the undershoot pulse for clutter
is either missing, negligible, or greatly delayed (until the edge moves off the
cell). This technique is effective in removing extended clutter from further
processing while retaining all targets within a wide range of target
velocities.


Figure  3.4-4.  Waveform  discrimination  technique. 
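
The waveform discrimination logic described above can be illustrated with a minimal sketch. The function names, the sample waveforms and the choice of d[0] = 0 are assumptions made for illustration only; the report specifies only the first-difference filter, the two-period look-ahead and the roughly 80 percent reversed-polarity threshold.

```python
def first_difference(samples):
    """First-difference transversal filter: d[k] = x[k] - x[k-1] (d[0] taken as 0)."""
    return [0.0] + [b - a for a, b in zip(samples, samples[1:])]


def waveform_discriminate(samples, threshold, undershoot_fraction=0.8, lookahead=2):
    """Return sample indices judged to be point targets.

    A positive threshold crossing is accepted only if an undershoot below
    -undershoot_fraction * threshold follows within `lookahead` periods.
    """
    d = first_difference(samples)
    hits = []
    for k, value in enumerate(d):
        if value > threshold:
            window = d[k + 1 : k + 1 + lookahead]
            if any(v < -undershoot_fraction * threshold for v in window):
                hits.append(k)
    return hits


# Hypothetical waveforms: a point target crossing one cell, and a clutter edge.
point_target = [0, 0, 0, 10, 0, 0, 0, 0]
clutter_edge = [0, 0, 0, 10, 10, 10, 10, 10]
```

With these inputs the point target produces a positive pulse followed immediately by an undershoot and is retained, while the edge produces no undershoot within two periods and is rejected.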
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Omni-Directional  Time  Delay  and  Integration 


A time delay and integration (TDI) filter determines the peak of the
signal and successively adds these peaks for each detector crossing. In this
manner all uncorrelated noise and clutter is averaged out, while the signal
peaks, being highly correlated, add in phase. The result is an enhancement
of the signal. The output may be used for both target detection and intensity
measurement.

The purpose of the TDI filter is to provide SNR enhancement prior
to detection for selected regions and velocities in pixel space. In the handover
function from the earth staring sensor to the ATH sensor, approximate
track information is available to reduce the computational complexity of the
omni-directional TDI function.

Spatial  Detection  Filter 

Spatial filtering is used to discriminate between targets and clutter
based on their relative physical size. The targets are defined to occur in fewer
than two detector elements simultaneously, while clutter will generally be
present in many contiguous detectors. Since inadequate spatial correlation
information (RM-19 data) is available, it is difficult to assess the performance
of a spatial filter at this time. However, some preliminary data are
presented.

Two spatial filter techniques are introduced in this section. The
first, local area pixel processing, is a relatively conventional approach
toward clutter discrimination. The second, Walsh-Hadamard processing, is
felt to be applicable to target enhancement. A system performance/cost
tradeoff should be conducted to select the optimum spatial filtering technique
for implementation.

Spatial  Filtering  Via  Local  Area  Pixel  Processing 

Local area pixel processing detects the presence of point targets by
sliding a small two-dimensional window across the array of pixel amplitudes.
The pixel(s) at the center of the window is tested for the presence of a possible
target. The window is sequentially moved across the array such that each
pixel is tested. The test is based on the fact that a target will result in an
amplitude peak with respect to the neighboring pixels.

The local area pixel processing algorithm for a 5 x 5 window is
illustrated in Figure 3.4-5. Two coefficients are computed for a line intersecting
the central candidate pixel, one representing target energy and the
other representing clutter energy. These coefficients are subtracted to
determine if a threshold has been exceeded. The process is repeated for
lines intersecting the central pixel(s) in each of four directions: vertical,
horizontal and the two diagonals. If the threshold is exceeded in all directions,
a hit is reported to the track processor.

The spatial filter algorithm is based on the computation:

Figure 3.4-5. The 5x5 pixel group for local area
pixel processing.

and where $A = 2^n$ and $B = 2^m$.

Figure 3.4-6 shows the response of the filter to a point target and
to a line. In addition to $P_k$, a "data valid" signal has to be generated. This
signal indicates that the pixel (i,j) contains a potential target. This is
determined by comparing the four $P_k$ with a threshold value. Only if all
four $P_k$ exceed the threshold is the pixel considered to contain a potential
target.

In order to carry out the computation of $P_k$ dynamically, it is
necessary to store 4 lines of the image, and the intensity of the 24 pixels
around (i,j) must be in registers that are easily accessible to the arithmetic
units. The value $a_{ij}$ also has to be available in an analogous manner. Four
adders, capable of adding 5 operands each, compute the values $P_k$. Multiplication
by the weights A and B is done by binary shifting since A and B are
restricted to be exact powers of 2. The spatial filter processor operates as
a pipeline.

Figure 3.4-6. Local area pixel
processing response.
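
The four-direction test can be sketched as follows. The exact computation used in the report is not legible in this copy, so the form assumed here, P_k = A * (center pixel) - B * (sum of the four other pixels on line k of the 5 x 5 window), is an assumption chosen to match the prose: one term for target energy, one for clutter energy, five operands per adder, and shift-only multiplication. The function names and the default exponents n = 2, m = 0 are likewise hypothetical.

```python
# Offsets (row step, column step) for the four lines through the central pixel:
# horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def line_response(img, i, j, di, dj, n=2, m=0):
    """Assumed form of P_k: (center << n) - (sum of 4 line neighbors << m).

    A = 2**n and B = 2**m, so both multiplications are binary shifts.
    """
    center = img[i][j] << n
    neighbors = sum(img[i + s * di][j + s * dj] for s in (-2, -1, 1, 2))
    return center - (neighbors << m)


def detect_hits(img, threshold, n=2, m=0):
    """Report pixels whose response exceeds the threshold in all four directions."""
    rows, cols = len(img), len(img[0])
    hits = []
    for i in range(2, rows - 2):
        for j in range(2, cols - 2):
            if all(line_response(img, i, j, di, dj, n, m) > threshold
                   for di, dj in DIRECTIONS):
                hits.append((i, j))
    return hits


# Hypothetical 7 x 7 test images with integer amplitudes on a flat background.
point_image = [[10] * 7 for _ in range(7)]
point_image[3][3] = 30
line_image = [[10] * 7 for _ in range(7)]
line_image[3] = [30] * 7
```

With A = 4 and B = 1 a flat background gives a zero response, a point target exceeds the threshold in every direction, and a line fails the test along its own direction, which is the qualitative behavior the text attributes to the filter.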

Knowing that 4 words have to be stored for each potential target
(hit), the amount of time required to process the 16,384 pixels of one
detector/mux array chip can be found to be:

\[ 16{,}384\ \frac{\text{pixels}}{\text{chip}} \times 4\ \frac{\text{words}}{\text{pixel}} \times 50\ \frac{\text{nsec}}{\text{word}} = 3.28\ \frac{\text{milliseconds}}{\text{chip}} \]

Thus several detector/mux array chips may time share one spatial filter,
i.e., $(0.1/3.28) \times 10^3 = 30$ detector/mux array chips may be processed by
one spatial filter processor.
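
The timing arithmetic can be checked with a short sketch. It assumes that the 0.1 in the expression above is a frame time of 0.1 second; the constant and function names are illustrative.

```python
PIXELS_PER_CHIP = 16_384
WORDS_PER_PIXEL = 4
NSEC_PER_WORD = 50
FRAME_TIME_MS = 100.0  # assumption: the 0.1 in the text is a 0.1 s frame time


def chip_time_ms():
    """Time to process one detector/mux array chip, in milliseconds."""
    return PIXELS_PER_CHIP * WORDS_PER_PIXEL * NSEC_PER_WORD * 1e-6


def chips_per_filter():
    """Whole number of chips one spatial filter can serve per frame."""
    return int(FRAME_TIME_MS // chip_time_ms())
```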

Digital Walsh-Hadamard Transform Spatial Filtering

The Walsh-Hadamard Transform (WHT) spatial filtering technique
is based on transforming a block of pixel amplitudes into sequencies through
use of a one-dimensional Walsh-Hadamard transform. Each sequency
represents a weighted sum of the detector outputs over the transform block
length. Since only the higher order sequencies contain target information,
the lower-ordered sequencies can be discarded.

Figure 3.4-7 illustrates the operations required to generate the first
eight sequencies of a length 16 Walsh-Hadamard transform by digital techniques.
The calculations are similar for length 128. By changing the uppermost
row of addition operators to subtractions the upper 8 sequencies can
be generated.

A linear combination of sequency coefficients results in additional
spatial filtering by eliminating the periodic nature of the Walsh-Hadamard
transform. Figure 3.4-8 shows the operations which are required to
enhance targets within two detector elements and to suppress clutter.

Figure 3.4-7. Forward Walsh-Hadamard transform operations
required to map spatial positions into sequencies of block length
N = 16.

Figure 3.4-8. Walsh-Hadamard spatial
filter operations. (Inputs: positions 1 through 16.)
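
The idea of discarding low-order sequencies can be illustrated with a minimal sketch. It builds a sequency-ordered Walsh matrix by sorting the rows of a Sylvester-type Hadamard matrix by their number of sign changes; this is a direct matrix form chosen for clarity and is not the butterfly network of Figure 3.4-7. All function names and the test blocks are assumptions.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def sequency(row):
    """Number of sign changes along a row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def walsh_matrix(n):
    """Hadamard rows reordered by increasing sequency."""
    return sorted(hadamard(n), key=sequency)


def wht(block):
    """Sequency-ordered Walsh-Hadamard transform of one block of pixel amplitudes."""
    return [sum(w * x for w, x in zip(row, block)) for row in walsh_matrix(len(block))]


def high_sequency_energy(block, keep_from):
    """Energy remaining after discarding sequencies below `keep_from`."""
    return sum(c * c for c in wht(block)[keep_from:])


# Hypothetical length-16 blocks: flat background, and a single-pixel target.
flat = [5] * 16
target = [0] * 16
target[7] = 3
```

A flat background falls entirely into sequency 0, whereas a single-pixel target spreads equally over all sixteen sequencies, so half of its energy survives when the lower eight are discarded.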
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Figure  3.4-9  presents  some  data  from  a simulation  used  to  develop  the 
concept. 

The  input  data  in  this  example  was  an  FAA  clutter  map  having  an 
amplitude  distribution  as  shown.  A low  contrast  target  was  superimposed 
on  the  clutter.  The  lower  figure  shows  the  enhancement  which  is  achieved 
when  a length  .1  Walsh  - Hadamard  spatial  filter  is  used.  Note  the  ability 
to  separate  the  target  from  the  background  by  use  of  threshold  detection  after 
transforming. 

In summary, both local area pixel processing and a digital WHT spatial
filter can be implemented in software in the track processor. The computational
loads are quite modest, corresponding to a throughput of 1 MIPS in
the track processor.

The spatial filtering will enhance target detection. Further
simulations based on real clutter data are required to determine the relative
performance of both techniques.

Target  Detection  Logic 

The  purpose  of  the  target  detection  threshold  circuits  is  to  generate 
hits  for  both  positive  and  negative  amplitude  signals  and  to  transmit  these 
hits  to  the  trackers.  The  trackers  can  handle  up  to  about  a 1 percent  hit 
probability  (targets  and  clutter). 

Feedback  is  provided  from  the  trackers  to  the  threshold  logic  so  that 
a fairly  constant  hit  rate  can  be  maintained  even  while  the  characteristics  of 
clutter  and  sensor  noise  vary  with  time  and  over  the  field  of  view. 

While adaption is a requirement, the ability to override that adaption
is also a requirement. This requirement results from at least two system
requirements:

1. The requirement to give priority in surveillance to known
launch sites and test areas even in the face of heavy clutter in
those areas.

2. The requirement to maximize data collection on established
tracks. Thus, during periods when the target intensity dims
(i.e., second stage through third stage and post boost thrusting),
reducing thresholds in the area of the target track is required.

Figure 3.4-9. Radar clutter map histogram.
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The  above  requirements  may  lead  to  the  following  implementation 
of  a three  threshold  system: 

The  lowest  threshold  is  called  the  data  rate  threshold  and  is  merely 
used  to  obtain  a count  on  the  number  of  data  points  which  exceed  this 
threshold  each  frame. 

The  intermediate  threshold  is  called  the  track  data  threshold  and 
all  data  points  from  the  filter  processing  unit  which  exceed  this 
threshold  are  available  to  support  the  existing  tracks  in  the  tracker 
unit.  Those  points  which  are  not  associated  with  existing  tracks 
may  be  discarded  if  only  forward  time  track  formation  is  implemented. 

The  highest  threshold  is  called  the  track  initiator  threshold  and  all 
data  points  which  exceed  this  threshold  and  which  have  not  been 
associated  with  an  existing  track  are  utilized  to  start  new  track  files 
if  track  slots  are  available. 
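
The three threshold scheme can be sketched as follows. The data layout (pixel identifier and amplitude pairs), the set of points already associated with tracks, and all names are assumptions for illustration; the report defines only the roles of the three thresholds.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    data_rate: float        # lowest: used only to count exceedances per frame
    track_data: float       # intermediate: supports existing tracks
    track_initiator: float  # highest: may start new track files


def classify_frame(points, thresholds, associated, free_slots):
    """Split one frame of (pixel_id, amplitude) points by the three thresholds.

    `associated` is the set of pixel ids the tracker has matched to existing
    tracks; `free_slots` is the number of track slots still available.
    Returns (exceedance count, track support points, new track starts).
    """
    count = 0
    support, new_tracks = [], []
    for pixel_id, amplitude in points:
        if amplitude > thresholds.data_rate:
            count += 1
        if amplitude > thresholds.track_data and pixel_id in associated:
            support.append(pixel_id)
        elif (amplitude > thresholds.track_initiator
              and pixel_id not in associated
              and len(new_tracks) < free_slots):
            new_tracks.append(pixel_id)
    return count, support, new_tracks


# Hypothetical frame: five data points, one of which lies on an existing track.
frame = [("a", 0.5), ("b", 1.5), ("c", 2.5), ("d", 3.5), ("e", 3.6)]
limits = Thresholds(data_rate=1.0, track_data=2.0, track_initiator=3.0)
```

Points above the intermediate threshold that are not associated with a track are simply dropped here, which corresponds to the forward time track formation case mentioned in the text.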

Figure 3.4-10 is a block diagram of one possible detection threshold
multi-level logic implementation.

Figure 3.4-10. Target detection logic. (Outputs: detections to track processor; number of targets at threshold, at 125% threshold and at 75% threshold, each to control bus.)

Track Processor

Figure 3.4-11 shows a block diagram of the track processor. It can
be implemented on five LSI chips using 1985 projected technologies. The
track processor is a computer which features a throughput of 8 MIPS, 16 bit
word length, specialized I/O ports and priority interrupt structure.
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Figure  3.4-11.  Track  processor  block  diagram. 

This computer has been specifically designed to implement the tracking
function required of APSP. The instruction set, which is also tailored
to the tracking function, is shown in Table 3.4-1. The major functions of the
five LSI chips are: 1) micro-programmed control unit (MCU), 2) arithmetic
unit (ARITH), 3) input/output and sequencing (I/O, SEQ) and 4) two random
access memories (RAM): one memory accepts hit data from the detection
logic; the other memory, containing the stored program, could be a PROM.

The arithmetic portion contains 128 general registers, an arithmetic
logic unit, a multiply network, and related functional units. A block diagram
of this unit is shown in Figure 3.4-12. The multiply network allows the
parallel multiplication of two 16-bit operands during two machine cycles.

The sequencing and I/O chip, shown in Figure 3.4-13, issues
addresses to memory for the purpose of fetching instructions and operands.
This unit contains the arithmetic capability to perform address calculations.
It also contains the interrupt structure with provisions for seven levels of
priority interrupts. An interrupt stack is provided that allows seven
levels of nested interrupts. The sequencing and I/O chip also interfaces with
the Track-Bus and the filter and detection processor for the purpose of
issuing track file data and adaptive thresholds, respectively.

The microprogram control chip, shown in Figure 3.4-14, supplies the detailed
control signals required to operate the track processor. It consists of a
512 x 32-bit ROM. During each machine cycle one word is fetched from the ROM and
placed in the Command Register. Subfields within this register are used
directly or decoded to provide the necessary control signals. Logic is
also provided to generate the address of the next control word to be fetched
from the ROM. This consists primarily of an address multiplexer and flag
select logic.

TABLE 3.4-1. THE INSTRUCTION SET OF THE
TRACK PROCESSOR

DATA MOVING           Load; Store; Exchange; Block Load; Block Store;
                      Reg. to Acc.; Acc. to Reg.; Load Immediate
LOGICAL               And; Or; XOR; 1's Complement
ARITHMETIC            Add; Subtract; Multiply; Divide; Increment Reg.;
                      Decrement Reg.
SHIFT                 Arithmetic; Arithmetic Double; Rotate;
                      Logical 0-Ext.; Logical 1-Ext.
INTERRUPT HANDLING    Set Trap Address; Set Mask Register; Resume
MISCELLANEOUS         Halt; No-Op; Reset and Start
BRANCHES              Unconditional; If Reg = 0; If Reg = ACC;
                      If Reg > ACC; If Reg < ACC;
                      Incr Acc. and Branch If 5 Reg;
                      If OFL on Previous Instr.
I/O                   Output to Vector Buffer; Output to Signal Processor;
                      Output to Neighbor; Input From Neighbor

Figure 3.4-12. Arithmetic chip.
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Figure  3.4-14.  Microprogram  control  unit. 

A typical memory is illustrated in Figure 3.4-15. Three regions of
memory are required for the processor. First, a region containing 5K
locations is used to store programs, constants, and variables. Second,
2K locations are used as a double buffer to store target HIT reports. Third,
1K locations are used to store selected raw data from the filter processor.

Figure 3.4-15. Memory chip.

The track processor is programmed to perform the following tasks:

• Target tracking

• Clutter tracking and deletion

• Adaptive temporal filtering

• Spatial filtering

• Position space reconstruction from the analog WHT pre-processor

• Control of detection threshold

• Control of AVE dynamic range modes

The use of a dedicated track processor (one per detector/mux array)
is the only available technique for most of the adaptive functions in the APSP
since all classical filtering algorithms lack the required flexibility.

Submodule  Processor  Chip  Partitioning 

Based on the projected power and densities of future LSI implementations,
the system submodule consists of 12 custom LSI chips interconnected
in a hybrid package. The organization of the chips is illustrated in
Figure 3.4-16.
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Signals  from  the  detector/mux  array  are  encoded  as  ten-bit  words 
on  a CCD/CMOS  chip  which  interfaces  with  the  digital  AVE  electronics. 
Commands  are  created  in  the  AVE  digital  chip  and  are  transmitted  to  the 
control  processor  chip  which  contains  the  dynamic  range  algorithms  (and 
over-rides)  which  optimize  the  S/N  ratio  for  the  various  missions. 

The temporal and spatial detection filters and adaptive detection
threshold logic are combined on one chip with the AVE digital logic. Associated
with the large random logic chips are serial (block organized) memories
which use serial-parallel-serial-parallel-serial CCD devices for nuclear
event suppression and temporal filtering.

The track processor in the APSP is composed of five advanced
technology LSI chips with random logic densities approaching 30,000 gates
per chip (200 x 200 mil). This density is made possible through high resolution
projection photolithography and electron beam microfabrication techniques,
while power dissipation is held well within thermal power density limits by the
use of low power devices.

Table 3.4-2 is a tabulation of the nine special chips that make up the
twelve chip submodule processor in a hybrid package. Estimated chip power
consumption and number of Input/Output chip pads are shown.
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TABLE  3.  4-2.  SUMMARY  OF  THE  APSP  HARDWARE 
CHARACTERISTICS 


Chip Type                     Number of          Power Consumption   Number of
(Number Required)             Equivalent Gates   Per Chip            I/O Pads

Analog AVE (1)                -                  81 mW               24
Digital AVE (1)               9,400              5 mW                28
Serial Memory (3)             320,000            14 mW               18
Arithmetic (1)                25,000             42 mW               38
Sequencing and I/O (1)        16,000             53 mW               79
Microprogram Control (1)      8,400              21 mW               44
RAM Memory (2)                82,000             15 mW               59
Bus Driver (1)                5,000              100 mW              30
Voltage/Bias Regulator (1)    -                  50 mW               36
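
As a rough cross-check of Table 3.4-2, the chip count and total power of one submodule can be tallied. This sketch assumes that the listed power is per chip and is multiplied by the number required, that the Sequencing and I/O entry reads 53 mW, and that one Bus Driver is required (the count that brings the total to the twelve chips stated in the text); the names are illustrative.

```python
# (chip type, number required, power per chip in mW), read from Table 3.4-2.
CHIPS = [
    ("Analog AVE", 1, 81),
    ("Digital AVE", 1, 5),
    ("Serial Memory", 3, 14),
    ("Arithmetic", 1, 42),
    ("Sequencing and I/O", 1, 53),    # assumed reading of the garbled entry
    ("Microprogram Control", 1, 21),
    ("RAM Memory", 2, 15),
    ("Bus Driver", 1, 100),           # count assumed from the twelve-chip total
    ("Voltage/Bias Regulator", 1, 50),
]


def submodule_totals(chips):
    """Return (total chip count, total power in mW) for one submodule."""
    count = sum(number for _, number, _ in chips)
    power_mw = sum(number * mw for _, number, mw in chips)
    return count, power_mw
```

Under these assumptions the nine chip types account for twelve chips dissipating somewhat under half a watt in total.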

4.0  DESCRIPTIONS  OF  FUNCTIONAL  UNITS 


This section describes in detail the functional units of the proposed
APSP. These units are the 1) adaptive video encoder with dynamic range
control, 2) temporal detection filter, 3) spatial detection filter using both
local area pixel processing and Walsh-Hadamard transform processing,
4) target detection logic and 5) the track processor. A number of symbols
are used frequently in the discussions which follow. For convenience, those
symbols are defined below.

DEFINITION  OF  SYMBOLS 

Q       Subscript, represents digital value

P_Q     Predicted digital word

P       Analog predicted word

S       Analog input from detector/mux array

E       Analog difference signal

E_Q     Digital difference signal

Q_o     Quantization level

PC_Q    Digital predicted corrected value
        (Output from encoder)

k       Discrete time index

n       Number of taps

T_D     Dwell time on detector

T       Sample period

m       mth derivative
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4.  1 The  Adaptive  Video  Encoder  (AVE) 


As shown in Figure 4.1-1, the AVE functions as the APSP interface
with the detectors of the Monolithic Focal Plane Array (MFPA). The AVE
receives data from the MFPA, encodes the data in digital form, and passes
it to the LAP for further processing. The AVE also receives control commands
and digital feedback from the LAP, senses saturation in the system,
and provides adjustment control and mode change operations.

The AVE encodes the MUXed analog signal from the MFPA chip. The
resulting word is a 10-bit representation of the signal. The encoder utilizes
a prediction of the next signal value based on previous values (rather than on
the statistical properties of the signal) to generate a difference signal which
is then encoded and added to the prediction to give the encoded signal. The
dynamic range of the encoder is 1023:1 (60 dB).

Figure 4.1-2 is a block diagram of the prediction feedback encoder.
The 10-bit predicted word $P_Q$ is converted to its analog counterpart $P$ and
subtracted from the MFPA signal $S$ via control of the fat zero level in the
CCD channel. A difference signal $E$ is thus generated which is encoded as
$E_Q$. This takes place in either channel 'a' or channel 'b'. Assuming MFPA
signals between 0 and 1, the $2^5 - 1$ quantization levels in both A/D converters
are determined as follows:

The 10-bit D/A converter generates analog signals between 0 and 1;
thus the LSB in the 10-bit word applied to this D/A must represent a value
of $1/(2^{10}-1)$. The LSB in the output word formed by channel 'b' of the
2-channel ADC must also represent this value. Since this is a 5-bit A/D converter,
the largest value it can convert without exceeding saturation is $(2^5-1)/(2^{10}-1)$.
Also since for channel 'b', $E$ is amplified by $2^5$, the saturating value needs
to be scaled by the same amount. Thus the channel 'b' A/D can represent
a full scale amplitude of $2^5(2^5-1)/(2^{10}-1)$ and because it is a 5-bit A/D its
quantization level will be this full scale voltage divided by $(2^5-1)$ or

\[ Q_o = \frac{2^5}{2^{10} - 1} \tag{1} \]


Figure 4.1-1. APSP architecture "A". (Block diagram: the focal plane array passes analog samples to the adaptive video encoder (AVE), which passes digital data to the layered array processor. Layer labels: 1st layer, 2nd layer, 3rd layer, 4th layer. Block labels: independent pixel element processing array; limited spatial correlation processing array; target data routing array; tracking array and interface unit. Signal labels: target addresses, target data.)

Figure 4.1-2. Prediction feedback encoder. (Block labels include gain normalization.)


The channel 'a' A/D saturation value is

\[ |E|_{SAT} = (2^5 - 1)\, Q_o. \]

Only the magnitude of $E$ will be applied to the A/D converters so that $E_Q$
must have a sign bit. Now, whenever

\[ 0 < |E| < |E|_{SAT}/2^5 \]

the 5-bit word from channel 'b' will be placed in bits 1-5 (LSB) of $E_Q$
(bits 6-10 will be zero). This is just a normal 5-bit A/D conversion of $E$.
However, when

\[ |E|_{SAT}/2^5 < |E| \le |E|_{SAT} \]

the 5-bit result from channel 'a' will be placed in bits 6-10 (MSB) of $E_Q$
(bits 1-5 will be zero). The result will be an approximate representation of
E with  an  error  of  magnitude  no  greater  than  E g ^t^3.  This  two  channel 
scheme is necessary to allow the encoder to recover from large errors in
predicting the signal. $E_Q$ is now added to the 10-bit predicted word $P_Q$,
forming the predicted-corrected value $PC_Q$, which is a 10-bit representation
of the MFPA signal $S$ and which is then used in predicting the next value of
$S$ and as output to the LAP.

The accuracy of the output word $PC_Q$ depends on the accuracy in the
A/D conversion of $E$; the D/A conversion of $P_Q$; and the analog summation
$S-P$. The quantization noise introduced by conversion of $E$ will dominate
if it is assumed for purposes of analysis that the analog summation is ideal.

The optimum predictor will produce difference signals that are no
greater than 5 bits such that when $E_Q$ and $P_Q$ are added, the resultant $PC_Q$
will represent the MFPA signal $S$ within the accuracy of the A/D. The predictor
must also have few delays since a memory will be required for each
pixel element for each delay.
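
The two-channel conversion can be illustrated with a minimal numerical sketch. It models the A/D converters as ideal rounding quantizers and expresses $E_Q$ as a signed integer in units of the 10-bit LSB; the function names and the use of rounding (rather than truncation) are assumptions, not details given in the report.

```python
LSB = 1.0 / (2**10 - 1)       # value of one 10-bit LSB for signals in 0..1
Q_O = 2**5 / (2**10 - 1)      # A/D quantization level, Equation (1)
E_SAT = (2**5 - 1) * Q_O      # channel 'a' saturation value


def encode_difference(e):
    """Return E_Q as a signed integer in units of the 10-bit LSB."""
    sign = -1 if e < 0 else 1
    magnitude = abs(e)
    if magnitude <= E_SAT / 2**5:
        # Channel 'b': E amplified by 2**5, result occupies bits 1-5.
        code = round(magnitude * 2**5 / Q_O)
        return sign * code
    # Channel 'a': unamplified, 5-bit result occupies bits 6-10.
    code = min(round(magnitude / Q_O), 2**5 - 1)
    return sign * (code << 5)


def encode_sample(s, p_q):
    """Form PC_Q = P_Q + E_Q for analog sample s and 10-bit prediction p_q."""
    e = s - p_q * LSB
    return p_q + encode_difference(e)
```

A good prediction leaves a small difference that channel 'b' resolves to one LSB, while a poor prediction (for example a prediction of zero for a half-scale sample) is corrected coarsely through channel 'a' in a single step.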
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The  Predictor 


A number of predictors including geometric feedback, ADPCM
(adaptive differential pulse code modulation), and least mean square cubic
types were considered during the study, and the polynomial fit predictor was
found to be optimum among those considered.

From Newton's backward difference formula, for an n-point prediction

\[ P_k = \sum_{i=0}^{n-1} \nabla^i S_{k-1} \]

where

\[ \nabla^0 S_k = S_k, \qquad \nabla^i S_k = \nabla^{i-1} S_k - \nabla^{i-1} S_{k-1}. \]

This may be rewritten

\[ P_k = \sum_{i=1}^{n} a_i S_{k-i} \tag{2} \]

where

\[ a_i = \binom{n}{i} (-1)^{i+1}. \]

The bound on the error magnitude $|S_k - P_k|$ will be proportional to
$\max |s^{(n+1)}(\theta)|$, $(k-n)T < \theta < kT$. Thus for rapidly changing signals, the
error can become large. For example, a unit step function can be considered
a worst case situation: for such a signal the predicted values are

\[ P_k = \begin{cases} 0, & k \le 0 \\ \sum_{i=1}^{k} a_i, & k = 1, 2, \ldots, n-1 \\ 1, & k \ge n \end{cases} \]

and the resulting difference signal is

\[ E_k = \begin{cases} 1, & k = 0 \\ 1 - \sum_{i=1}^{k} a_i, & k = 1, 2, \ldots, n-1 \\ 0, & \text{elsewhere} \end{cases} \tag{3} \]

Thus the prediction is very far from the signal value beginning at the discontinuity
and takes n samples to recover, essentially a ringing effect.
Table 4.1-1 shows the results of Equation (3) for $1 \le n \le 5$.

Figure 4.1-3 shows the implementation of Equation (2). A total of
n delays, n multipliers, and an n-input adder are required.
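
The unit step behavior of the n-point predictor can be tabulated with a short sketch. It assumes the binomial weights $a_i = \binom{n}{i}(-1)^{i+1}$ of the polynomial fit predictor and an ideal encoder in which the predictor operates on the exact signal (no quantization and no clamping); the function names are illustrative.

```python
from math import comb


def predictor_coefficients(n):
    """Weights a_1..a_n of the n-point polynomial predictor (assumed binomial form)."""
    return [(-1) ** (i + 1) * comb(n, i) for i in range(1, n + 1)]


def step_difference_signal(n, length):
    """Difference signal E_k = S_k - P_k for a unit step applied at k = 0."""
    a = predictor_coefficients(n)

    def step(k):
        return 1 if k >= 0 else 0

    signal = []
    for k in range(length):
        p_k = sum(a[i - 1] * step(k - i) for i in range(1, n + 1))
        signal.append(step(k) - p_k)
    return signal


def step_table(max_n=5):
    """Difference signals for n = 1..max_n, each long enough to show recovery."""
    return {n: step_difference_signal(n, n + 2) for n in range(1, max_n + 1)}
```

Under these assumptions the difference signal alternates in sign with binomial magnitudes and returns to zero after n samples, so larger n gives a longer and larger transient after a discontinuity.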

Referring to Figures 4.1-2 and 4.1-3, the z-domain equations can
be written to determine the system frequency characteristics (neglecting
quantization). These are

\[ E(Z) = S(Z) - P(Z) \]

\[ P(Z) = \sum_{i=1}^{n} a_i\, PC(Z)\, Z^{-i} \triangleq H(Z)\, PC(Z) \]

\[ PC(Z) = E(Z) + P(Z) = S(Z). \]
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TABLE  4.  1-1.  DIFFERENCE  SIGNAL  RESPONSE  TO  UNIT  STEP 
INPUT  FOR  N- POINT  PREDICTORS 


Figure 4.1-3. n-Point polynomial predictor.

Thus

\[ H_D(Z) \triangleq E(Z)/S(Z) = 1 - H(Z) \]

where the predictor transfer function is given by

\[ H(Z) = \sum_{i=1}^{n} a_i Z^{-i}. \]

The magnitude characteristic $|H(e^{j\omega T})|$ is plotted in Figure 4.1-4.

Now from the binomial theorem and Equation (2), the encoder transfer
function is

\[ H_D(Z) = (1 - Z^{-1})^n. \]

The magnitude characteristic is

\[ |H_D(e^{j\omega T})| = 2^n \left| \sin \frac{\omega T}{2} \right|^n \]

and is plotted in Figure 4.1-5. A maximum value of $2^n$ occurs at $\omega T = \pi$
(or f = 1/2T). Because of the optical system characteristics, the signal will
be approximately band limited to $\pm 1/t_d$ where $t_d$ is the dwell time (time for
a point to move across a detector cell). If $t_d > 2T$ to prevent aliasing, then
the predictor will not become unstable at large values of n. The tradeoff is
between a large n for predictor accuracy (recall that the error is proportional
to the (m+1)st derivative of the signal) and a small n to minimize the error
for signals with significant energy in the region around $1/t_d$. In favor of a
smaller n is the better impulse response characteristic (Table 4.1-1) and
lower complexity.
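
The closed-form magnitude characteristic can be checked numerically against the direct evaluation of $1 - H(Z)$ on the unit circle. The sketch below assumes binomial predictor weights $a_i = \binom{n}{i}(-1)^{i+1}$; the function names are illustrative.

```python
import cmath
import math
from math import comb


def encoder_response(n, omega_t):
    """|1 - H(e^{j wT})| evaluated directly from the predictor weights."""
    z_inv = cmath.exp(-1j * omega_t)
    h = sum((-1) ** (i + 1) * comb(n, i) * z_inv ** i for i in range(1, n + 1))
    return abs(1 - h)


def closed_form(n, omega_t):
    """2^n |sin(wT/2)|^n, the closed-form magnitude characteristic."""
    return 2 ** n * abs(math.sin(omega_t / 2)) ** n
```

The two agree to rounding error, and the peak of $2^n$ at $\omega T = \pi$ shows how quickly the gain near the folding frequency grows with n, which is the motivation for keeping n small.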
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Figure  4.  1-4.  Frequency  characteristic  of 
n-point  predictor. 


The sum of the squared error is given by

\[ \sum_k |E_k|^2 = \frac{1}{2\pi j} \oint E(Z)\, E(Z^{-1})\, Z^{-1}\, dZ \]

If the worst case signal is assumed to be the unit step then

\[ S(Z) = (1 - Z^{-1})^{-1} \]

Encoder  Definition 


The encoder will use the 1 or 2 point polynomial fit predictor. Previous
values of the encoded signal $PC_Q$ will be stored in a 10n-bit (n = 1, 2) shift
register, one for each pixel element on the MFPA. This n-frame memory
may be accessed for later processing and for MFPA dynamic range control
scaling. By doubling the memory size, two-color input schemes can be
readily accommodated.

The predictor processor will also take advantage of a priori knowledge
of the MFPA signal dynamic range (0-1) in that all predictions falling outside
this range will be clamped at the appropriate boundary. In addition to
improving predictor accuracy this ensures that the dynamic range of the A/D
converter is not exceeded and hence greatly enhances the unit step response
of the encoder. Since some predictor weighting coefficients are larger than
unity, intermediate weighted sums may exceed 10 bits for maximum signal
inputs; thus the predictor adder and result register will require 11 bits. The
extra bit will be used to detect any overflow and in that event will set the
10 LSB to all ones and apply the 10-bit result to the D/A and to the $E_Q + P_Q$
adder. At the lower end of the dynamic range, negative predictor values will
be set to zero before being applied to the D/A and to the adder.
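
The clamping rule can be sketched for the 2-point predictor. The sketch works on integer 10-bit values of $PC_Q$ and assumes weights (2, -1), the 2-point case of the polynomial fit predictor; the function name and argument layout are illustrative.

```python
FULL_SCALE = 2**10 - 1  # 10 LSB all ones


def predict_clamped(history, coefficients=(2, -1)):
    """Clamped prediction P_Q from previous PC_Q values (most recent first).

    Overflow of the weighted sum is replaced by all ones; negative sums
    are replaced by zero, as described for the predictor processor.
    """
    total = sum(a * pc for a, pc in zip(coefficients, history))
    if total > FULL_SCALE:
        return FULL_SCALE
    if total < 0:
        return 0
    return total
```

For n = 2 the largest positive intermediate sum is 2 x 1023 = 2046, which fits in the 11-bit adder mentioned in the text.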

For a maximum frame rate of 10 frames/second and $2^{14}$ pixels/MFPA,
the conversion time of the encoder must be no greater than 6.1 μsec. The
cost of the encoder can be determined from a list of its component parts; for
each MFPA these are:

1 2-channel A/D converter ($E$ to $E_Q$)

1 10-bit D/A converter ($P_Q$ to $P$)

1 Analog summer ($S-P$)

1 X32 amplifier

1 10-bit plus sign bit adder ($E_Q + P_Q$)

1 $10n(2^{14})$-bit memory/color

1 n-tap digital transversal filter (11 bits) with underflow and
overflow logic.
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4.2  Temporal  Filter 


The temporal filter (TF) will be realized by a third difference digital
filter. This design is based on performance requirements in Section 6.2.
The relation between input and output is given by

\[ f_{out}(n) = f_{in}(n) - 3 f_{in}(n-1) + 3 f_{in}(n-2) - f_{in}(n-3). \]

This filter requires 3 frames of memory.
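
The third difference relation can be exercised on a single pixel's frame history with a minimal sketch. The function name and the test sequences are assumptions for illustration.

```python
def third_difference(frames):
    """Apply f_out(n) = f_in(n) - 3 f_in(n-1) + 3 f_in(n-2) - f_in(n-3).

    `frames` is the amplitude history of one pixel; output starts at n = 3,
    once three previous frames are available in memory.
    """
    return [
        frames[n] - 3 * frames[n - 1] + 3 * frames[n - 2] - frames[n - 3]
        for n in range(3, len(frames))
    ]


# Hypothetical inputs: a slowly varying (quadratic) background and a one-frame pulse.
background = [k * k for k in range(8)]
pulse = [0, 0, 0, 0, 5, 0, 0, 0]
```

Any background that varies as a polynomial of degree two or less in time is cancelled exactly, while a one-frame pulse passes with the weights 1, -3, 3, -1.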

For a single velocity range, one implementation of the difference
equation is sufficient. For each added velocity range, or integration
period, one additional realization of the filter equation must be implemented.

The implementation poses a significant problem: serial versus
parallel processing. Parallel processing presents a pin-limitation-versus-density
dilemma. For 4 inputs and one output of 16 bits each, a
minimum of 80 pins is required. However, the logic required is far below
500 gates. This would create a tremendous waste of volume in the Signal
Processor. Handling the data in a serial manner, on the other hand, will not only
reduce the number of pins per package, but allow many TF's to be
fabricated on a single chip. This is also compatible with the serial
implementation of the AVE.

The implementation of the TF is as follows and is depicted in
Figure 4.2-1.

For 3rd difference: 2 adders at 75 gates each plus 1 difference at 75 gates =
225 gates per 3rd difference filter.

For each additional velocity bin, two memory chips plus one section
of a TF chip are required.

Figure 4.2-1. Temporal filter (serial): a) 2nd difference, b) matched.

4.3 Spatial Filter

Spatial filtering refers to processing which is done on a single frame of
MFPA data for the purpose of locating pixels illuminated by targets. Two
classes of spatial filtering algorithms have been proposed: local area pixel
processing and Hadamard spatial filtering. Local area pixel processing is
the name given to the class of algorithms in which the output corresponding
to a particular pixel is a function of its input amplitude and that of a small
number of nearest neighbors. This method is discussed first. Hadamard
spatial filtering is a technique whereby a block of pixel amplitudes is
transformed into the sequency domain, low-order sequencies are discarded,
and a type of inverse operation is then performed on the retained sequencies
to obtain the output. The Hadamard algorithm is described and a
computationally equivalent algorithm is presented which falls into the
category of local area pixel processing as defined above.


Local  Area  Pixel  Processing 

Local area pixel processing algorithms use a small window which
is moved across the array of MFPA data. At a given time the small aggregate
of pixels contained within the window is used to calculate the output
corresponding to the central pixel. This concept is implemented by shifting
pixel amplitude data through the spatial filter such that each pixel is
processed as a central point. Specific implementations of this type of
spatial filter are described here.
Method 1 is illustrated in Figure 4.3-1. The four nearest neighbors
on the diagonals are summed and multiplied by a weighting factor. The
four next nearest neighbors on the diagonals are also summed and multiplied
by a second weighting factor. These terms are then subtracted from the
central pixel to form the output.

[...] subtracted from the central pixel to form the output.

Figure 4.3-1. Spatial filter method 1.
Figure 4.3-2. Spatial filter method 2.
The algorithm shown in Figure 4.3-3 groups neighboring pixels
according to their proximity to the center. Weighted sums of each group are
subtracted from the central pixel to form the output.
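
As an illustration of this class of algorithm, Method 1 might be sketched as follows. Several details are assumptions made for the example: the nearest diagonal neighbors are taken at offsets (±1, ±1), the next nearest at (±2, ±2), border pixels are simply left at zero, and the coefficient names and values are hypothetical since the report gives its coefficients only in the figures.

```python
def spatial_filter_method1(image, c_near, c_far):
    """Central pixel minus weighted sums of two rings of diagonal neighbors.

    image  : list of rows of pixel amplitudes
    c_near : weight for the four nearest diagonal neighbors (assumed name)
    c_far  : weight for the four next nearest diagonal neighbors (assumed name)
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(2, rows - 2):          # skip a 2-pixel border
        for c in range(2, cols - 2):
            near = sum(image[r + dr][c + dc]
                       for dr in (-1, 1) for dc in (-1, 1))
            far = sum(image[r + dr][c + dc]
                      for dr in (-2, 2) for dc in (-2, 2))
            out[r][c] = image[r][c] - c_near * near - c_far * far
    return out
```

With both weights set to 1/8, a uniform background is cancelled exactly while an isolated point keeps its full amplitude.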

To obtain a rough idea of the response characteristics of these
filtering algorithms, some calculations were performed which simulate their
response to simple spatial features. Both a point step and a line step were
passed through each filter along the diagonal and along the horizontal as
shown in Figure 4.3-4. The resulting response curves are shown in
Figures 4.3-5, 4.3-6 and 4.3-7. The coefficients chosen for each of these
examples are shown in the figures. Note that each of the filters exhibits a
significant response to spatial lines. Thus spatial lines of a given amplitude
have the effect of appearing as points of smaller amplitude.

Figure 4.3-3. Spatial filter method 3.
Figure 4.3-5. Method 1.
Figure 4.3-6. Method 2.
Figure 4.3-7. Method 3.

The basic goal of spatial filtering is to detect the presence of targets
and to suppress background clutter. The absolute amplitude of targets must
be preserved during this process. For a given filter, a tradeoff exists
between preserving target amplitude and responding only to point targets. It
may not be possible to perform both of these functions satisfactorily with a
single filter. An alternative approach is to provide two filters, one for
detection and one for background suppression. This concept is shown in
Figure 4.3-8. The output of the detection filter is used to gate the output
of the background suppression filter.

Figure 4.3-8. Spatial filtering using separate filters
for detection and suppression.
A possible implementation of this scheme is shown in Figure 4.3-9.
Four one-dimensional filters are provided which are aligned along the
horizontal, the vertical and the two diagonals. The output of each filter is
tested against a threshold to detect the presence of an amplitude peak
at the center. If all thresholds are exceeded, then the outputs of the four
one-dimensional filters are combined to form the output corresponding to
the center pixel. With this implementation both the detection filter and the
suppression filter share the same arithmetic operations.

Figure 4.3-9. Spatial filter method 4.
Hadamard Spatial Filtering

A block diagram of Hadamard spatial filtering is shown in Figure 4.3-10.
This discussion is limited to Hadamard spatial filtering in one dimension,
although the technique can be extended to two dimensions. First, a forward
Hadamard transform is performed on a block of pixel amplitude data. The
block length, N, is expected to be equal to the length of a single row of MFPA
data, which is 128. Spatial filtering is achieved by eliminating the low-order
sequencies, so only sequencies 32 through 127 are generated when performing
the forward transform. Weighted sums of the sequencies are generated by the
position filter. These sums represent the target energy incident on a pair of
adjacent pixels.

Figure 4.3-11 shows the operations required for a forward Hadamard
transform of block length 16. Pixel amplitudes at the top of the diagram are
combined through addition and subtraction to form the sequency coefficients.
This figure shows only half the operations required for a full transform. The
same structure with the top row of addition operators changed to subtraction
operators results in sequencies 8 through 15.

Figure 4.3-10. Hadamard spatial filter.
Figure 4.3-12. Position filter.

Once the sequencies have been generated, a type of inverse operation
is performed to determine position amplitudes as shown in Figure 4.3-12.
Each node labeled with a double subscript represents target energy incident
on the corresponding pair of pixels within the block.

If the diagrams in Figures 4.3-11 and 4.3-12 are combined into a
single network and the operations are minimized, Table 4.3-1 is obtained.
Each row in this table represents the computationally equivalent operations
which must be performed on the input amplitudes to obtain output amplitudes
directly without performing a Hadamard transform. For example, to
obtain P(1,2), positions three and four are subtracted from positions one and
two as follows:

P(1,2) = 4(J_1 + J_2) - 4(J_3 + J_4)

where J_i = amplitude of pixel i.

Note that either two or four of the nearest neighbors are required to generate
the position amplitudes, making this approach similar to local area pixel
processing.
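
The equivalence between discarding low-order sequencies and a purely local operation can be checked numerically. The sketch below is an illustration under stated assumptions, not the report's position filter: it uses a block length of 16, discards the lowest quarter of the sequencies (the same proportion as 32 of 128), and applies a plain normalized inverse transform, so its scale factors need not match the weights in Table 4.3-1. All function names are hypothetical.

```python
def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def sequency(row):
    """Number of sign changes along a row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def hadamard_highpass(x, cutoff):
    """Transform x, drop sequencies below `cutoff`, and invert."""
    n = len(x)
    rows = sorted(hadamard(n), key=sequency)       # sequency order
    coeffs = [sum(r[i] * x[i] for i in range(n)) for r in rows]
    kept = list(zip(rows, coeffs))[cutoff:]
    return [sum(c * r[i] for r, c in kept) / n for i in range(n)]


def remove_block_mean(x, block):
    """Local equivalent: subtract the mean of each block of `block` pixels."""
    out = []
    for start in range(0, len(x), block):
        seg = x[start:start + block]
        mean = sum(seg) / block
        out.extend(v - mean for v in seg)
    return out
```

Under these assumptions the two routes agree, and summing two adjacent outputs gives a pair value that depends only on the four pixels of the local block, in line with the observation that two or four nearest neighbors suffice.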

Table 4.3-2 shows the number of arithmetic operations required for
each of the various phases of Hadamard spatial filtering. A substantial
computational savings results from generating position filter values directly,
as shown in Table 4.3-1, as opposed to the two-phase Hadamard approach.

Hardware  Implementation 

Figure 4.3-13 shows a block diagram of a local area pixel processor
(spatial filter). This processor accepts digitized pixel amplitude data as
input and performs sequential arithmetic operations to generate target
detection reports. Detection thresholds are variable and are specified by the
track processor. The line storage memory stores several lines of pixel
amplitudes so that all pixels within a sliding window are accessible
simultaneously. (For a 5 x 5 window, four lines must be stored.) The local
storage RAM is a fast access memory used to store temporary variables and
target detection records. The arithmetic and logical operations required to
execute the spatial filtering algorithms are performed by the arithmetic unit.
The threshold and target detection registers comprise the interface with
the track processor; they receive threshold settings and report target
detections, respectively.

Figure 4.3-13. Spatial filter processor.

For a frame rate of 10 Hz the processor must perform the calculations
associated with a single pixel location within approximately 6 µsec.
The algorithms discussed in the previous sections require from 5 to 50
operations per pixel. Assuming an overhead of 100 percent, this results in
a range of 10 to 100 operations per pixel. This estimate imposes a
performance requirement for a processor cycle time which ranges from 60
to 600 nsec, if the operations are to be performed sequentially. (Processor
cycle time refers to the time required to perform a single operation such as
addition or subtraction.)
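
The arithmetic behind these figures can be reproduced with a short calculation. The sketch assumes one sequential processor per 128 x 128 MFPA chip, as stated elsewhere in this section; the function and parameter names are illustrative only.

```python
def cycle_time_budget_ns(pixels_per_frame, frame_rate_hz,
                         ops_per_pixel, overhead=1.0):
    """Longest processor cycle time (ns) that still keeps up with the data.

    overhead=1.0 corresponds to the 100 percent overhead assumed in the text.
    """
    time_per_pixel_ns = 1e9 / (pixels_per_frame * frame_rate_hz)
    total_ops = ops_per_pixel * (1.0 + overhead)
    return time_per_pixel_ns / total_ops


pixels = 128 * 128                                  # one MFPA chip
fast = cycle_time_budget_ns(pixels, 10, 50)         # about 61 ns
slow = cycle_time_budget_ns(pixels, 10, 5)          # about 610 ns
```

The per-pixel time works out to roughly 6.1 µsec, so the rounded 60 to 600 nsec range quoted above follows directly.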

The hardware complexity of a sequential spatial filter processor of the
type shown in Figure 4.3-13 is estimated to be 30,000 to 40,000 gate
equivalents. This assumes a 5 x 5 window size. A similar result could be
obtained with a CCD implementation.

Conclusion 

A sequential processor capable of performing the spatial filtering
associated with a single MFPA chip appears feasible for the algorithms
described here. Further analysis is required before an optimum algorithm
can be selected.

An appropriate target/clutter model must be formulated so that the
optimum coefficients can be obtained and a performance estimate made
for each of the algorithms. An algorithm can then be selected on the basis
of performance vs. complexity (number of operations per pixel). The
spatial filter implementation should minimize the number of operations per
pixel by sharing arithmetic operations among the required subfunctions
(i.e., clutter suppression, target detection, and thresholding).
4.4 Track Processor

This section describes the tracking multiprocessor system and the
individual Microprocessor Trackers (µPT) which comprise the system that is
positioned between the APSP Signal Processor and the Data and Control
Processor (Figure 4.4-1). The Track Processor receives filtered data in the
form of "hits" (potential targets) from the Signal Processor. These data
undergo a special sorting procedure (commonly called tracking) in the Track
Processor. Hits that appear to be the logical continuations of target tracks
are used to update those tracks, while those that are found to be clutter are
discarded. Data describing all tracks currently being monitored are sent
periodically to the Data and Control Processor. Thresholding and algorithm
selection commands may be sent back to the Signal Processor.

The volume of computation for the Track Processor can be estimated
from the expected hit rate, which in turn is estimated to be about 1 percent
(i.e., 1 of every 100 pixels in the MFPA is reported as a hit by the Signal
Processor). In the case of the largest contemplated MFPA (10^8 pixels),
the volume of computation is too large to be handled by a single central
processor. For this reason a number of multiprocessing and array
processing methods have been examined.
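
A rough sense of the load can be had from the following calculation. It is a sketch under assumptions: the largest array is taken here as 10**8 pixels, and the 10 Hz frame rate quoted for the spatial filter is assumed to apply to the tracker as well. The function names are illustrative.

```python
def hits_per_frame(pixels, hit_percent=1):
    """Expected hits in one frame for a given hit rate in percent."""
    return pixels * hit_percent / 100


def hits_per_second(pixels, frame_rate_hz, hit_percent=1):
    """Expected hits per second across the array."""
    return hits_per_frame(pixels, hit_percent) * frame_rate_hz


whole_array = hits_per_second(10 ** 8, 10)   # assumed array size and frame rate
one_chip = hits_per_frame(128 * 128)         # a single 128 x 128 MFPA chip
```

Under these assumptions the whole array produces on the order of ten million hits per second, while a single chip produces about 164 hits per frame, a load that is far more manageable for one processing element.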


Figure 4.4-1. Position of the tracking multiprocessor in the system.
Two basic approaches have been examined:

1. dynamic assignment of processors to tracks, and

2. a priori assignment of processors to fixed regions of the
focal plane.

While the first of these approaches is attractive because of dynamic resource
allocation, it also presents difficulties in the areas of communication and
fault tolerance. Such a solution requires universal communication between
processors and/or a central supervisor coordinating all processors.
The difficulty with this approach arises from the size of the communication
networks required and from the lack of fault tolerance of systems with
central supervisors. For these reasons the latter approach has been
chosen.


The King-Connected Array Processor

In the King-Connected Array Processor one processing element is
assigned to each MFPA chip (128 x 128 pixels). Each processor communicates
with 8 neighboring processors (like the moves of a king on a chessboard;
hence the name of the configuration) (Figure 4.4-2). The sole
purpose of interprocessor communications is to handle tracks that cross
MFPA chip boundaries.
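
The king-move connectivity can be written down directly. In this sketch processors are identified by (row, column) positions in a rectangular array, and processors on the edge of the array are assumed simply to have fewer neighbors; neither detail is specified in the report.

```python
def king_neighbors(row, col, n_rows, n_cols):
    """Positions of the processors adjacent to (row, col), king-move style."""
    return [(row + dr, col + dc)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= row + dr < n_rows
            and 0 <= col + dc < n_cols]
```

An interior processor has 8 neighbors, an edge processor 5, and a corner processor 3.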

Since the Signal Processor is also partitioned into single processing
elements for each MFPA chip, it is only necessary for each processing
element of the King-Connected Array to communicate with the corresponding
element of the Signal Processor.

Periodic updates of track status are sent from each processor element
to the Data and Control Processor over a bus referred to as the Track
Bus. Each processing element is periodically interrogated, at which time
it reports the status of all the tracks it is monitoring.

The Microprocessor-Tracker (µPT)

Each processing element of the King-Connected Array is a microprocessor
and is referred to as a Microprocessor Tracker (µPT).

Figure 4.4-3 shows the connections between a µPT and its environment.
In addition to the flows of data, the µPT receives external interrupts
and can be loaded with a new program from the Data and Control
Processor.

Figure 4.4-3. Data flow in the µPT.

The µPT is a microprocessor tailored to the task of implementing
one processing element of the Track Processor. For this reason its
architecture exhibits special features not found in commercially available
microprocessors.

Special Features

Generally, the µPT is a bus-organized 16-bit processor. The special
features of the µPT are:

• 128 fast access (1 cycle) registers located on the arithmetic
chip,

• 16 x 16 bit fully parallel multiply network (2 cycles),

• a 7-level priority interrupt structure,

• autonomous I/O interfaces, and

• a dual port, automatically switching, partially duplicated
memory.

Each of these features is described in detail later in this section.

Partitioning 

The µPT consists of 5 chips (Figure 4.4-4):

a. the arithmetic chip, which contains the 128 general registers,
the arithmetic-logic unit (ALU), the multiply network, and
related functional units;

b. the sequencing and I/O chip, which contains the program counter,
the interrupt structure, the memory accessing hardware and the
autonomous I/O interfaces;

c. 2 memory chips, each containing 4K words of 16 bits each;

d. the microprogram control chip, which contains the microprogrammed
control unit for the entire µPT.

Figure 4.4-4. Partitioning of the µPT.

Detailed  Description  of  the  Arithmetic  Chip 

The architecture of the arithmetic chip is shown in Figure 4.4-5.
Except for the various control and status lines, the only data path leading
off this chip is the Main Bus over which memory data will be transmitted
and received.

Internally the chip contains two busses: the Arithmetic Bus (A-Bus),
which accommodates most register transfers on the chip, and the Iteration
Counter Bus (I-Bus), which allows selection of inputs to the Iteration Counter
(I). (The I-Bus may subsequently be replaced by a multiplexer if that is
advantageous.)

The  following  is  a description  of  the  various  functional  units  on  the 
arithmetic  chip. 


The  General  Registers 

A group of 128 16-bit registers is provided for the user program.
These registers offer fast access (1 cycle). Since the registers are under
the control of the user program, addressing is done exclusively from
certain fields of the Instruction Buffer Register (IBR). Data from the General
Registers can be sent to the A-Bus or to the I-Bus. The General Registers
can be loaded from the A-Bus.

Figure 4.4-5. The arithmetic chip.

The  Constant  ROM  (CROM) 


This read-only memory contains certain constants and masks necessary
in the interpretation of the instruction set. The CROM is addressed by
the microprogram. The CROM is 16 bits wide and its length is estimated to
be 16 words. Data from the CROM can be sent to the A-Bus or to the I-Bus.


The  U and  V Registers 

These two registers are the transfer buffers between the Main Bus
and the A-Bus. Each is 16 bits wide. The IBR can be loaded from the
V-register.


The  Iteration  Counter  (I) 


The Iteration Counter is used in the implementation of iterative
instructions, such as shifts, block moves, division, etc.; it is an 8-bit
up-counter. At the beginning of an iterative algorithm I is loaded with a
negative value (-1 to -128) and is then counted up until zero is reached.
Detection of the zero condition is thus reduced to monitoring of the sign bit.
The Iteration Counter can be loaded from the I-Bus. For shift instructions,
the value of the shift-count field of the IBR is transferred to the Iteration
Counter. For block move instructions the length of the block will be
transferred from a General Register to the Iteration Counter. For division the
initial value of the Iteration Counter will be transferred from the CROM.
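
The sign-bit termination test can be illustrated with a small simulation. This is a behavioral sketch only; the function name is hypothetical and the counter is modeled as an 8-bit two's-complement value.

```python
def run_iterations(count):
    """Simulate the 8-bit up-counter for `count` iterations (1 to 128).

    The counter is loaded with -count and stepped upward; the loop ends
    when the sign bit (bit 7) clears, which happens exactly at zero.
    """
    counter = (-count) & 0xFF          # 8-bit two's-complement load
    iterations = 0
    while counter & 0x80:              # sign bit still set: keep iterating
        iterations += 1                # one step of the iterative instruction
        counter = (counter + 1) & 0xFF
    return iterations
```

Because every value from -128 to -1 has the sign bit set and zero does not, no full zero comparator is needed.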

The  Arithmetic-Logic  Unit  (ALU) 

The ALU has two 16-bit inputs designated A (left) and B (right). The
output of the ALU is one of the following functions of A and B:

A
B
A + B
A - B
A .OR. B
A .AND. B
A .XOR. B
A
-B
all 0's
all 1's

In addition to the 16-bit result, the ALU detects overflow for the operations

A + B
A - B
-B.

The operation to be performed by the ALU is selected by 4 control
lines from the microprogram.
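
A behavioral sketch of such an ALU is given below. The operation names are hypothetical stand-ins for the 4-bit control code, two's-complement arithmetic is assumed, and the duplicated "A" entry in the function list is not modeled.

```python
MASK = 0xFFFF
SIGN = 0x8000


def to_signed(word):
    """Interpret a 16-bit word as a two's-complement integer."""
    return word - 0x10000 if word & SIGN else word


def alu(op, a, b):
    """Return (result_word, overflow) for 16-bit inputs a and b."""
    a &= MASK
    b &= MASK
    logic = {"A": a, "B": b, "OR": a | b, "AND": a & b, "XOR": a ^ b,
             "ZEROS": 0, "ONES": MASK}
    if op in logic:
        return logic[op], False        # no overflow on pass/logic functions
    exact = {"ADD": to_signed(a) + to_signed(b),
             "SUB": to_signed(a) - to_signed(b),
             "NEG_B": -to_signed(b)}[op]
    overflow = not (-0x8000 <= exact <= 0x7FFF)
    return exact & MASK, overflow
```

Negating B overflows only for the most negative value, which is why -B appears in the list of operations with overflow detection.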
The propagation delay through the ALU is very short (under 4 nsec),
thus allowing ample time for storing the result during the same cycle.

The  A,  X and  B Registers 

These  registers  are  all  16  bits  wide.  The  A and  B registers  serve 
as  the  A and  B input  to  the  ALU.  Both  can  be  loaded  from  the  A-Bus.  The 
X-register  is  used  in  division  and  for  shift  instructions  together  with  the 
A-register.  The  X-register  can  be  loaded  only  from  the  A-register  and 
the  contents  of  the  X-register  can  be  transmitted  directly  over  the  A-Bus. 

The A and X registers can be shifted at the rate of 1 bit/cycle. They
can be shifted right or left as one unit. For left shifts the carry-in into the
right end of the X-register is controlled by the Quotient Bit logic network.
For right shifts, sign extension is provided on the left end of the A-register.

The Multiply Network

The  Multiply  Network  facilitates  fully  parallel  multiplication  of  two 
16-bit  numbers.  The  result  is  valid  after  2 cycles  (50  ns).  The  Multiply 
Network  has  two  16-bit  inputs  (multiplicand  and  multiplier)  and  two  16-bit 
outputs  (most  and  least  significant  bits  of  the  result)  to  the  A-Bus. 
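
The input and output format of the Multiply Network can be illustrated as follows. The report does not say whether the operands are signed; this sketch assumes two's-complement operands, and the function name is hypothetical.

```python
def multiply_network(multiplicand, multiplier):
    """Return (most_significant_word, least_significant_word) of the product.

    Inputs are 16-bit words interpreted as two's-complement integers
    (an assumption); the 32-bit product is split into two 16-bit words.
    """
    def to_signed(word):
        word &= 0xFFFF
        return word - 0x10000 if word & 0x8000 else word

    product = to_signed(multiplicand) * to_signed(multiplier)
    return (product >> 16) & 0xFFFF, product & 0xFFFF
```

A result that is valid after 2 cycles in 50 ns implies a processor cycle of 25 ns.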

The  M and  N Registers 

These  two  registers  are  each  16  bits  wide  and  serve  to  hold  the 
multiplicand  and  multiplier  for  the  Multiply  Network.  The  N-register  can 
be  loaded  from  the  A-Bus  directly.  The  M-register  can  only  be  loaded 
from  the  N-register. 

The  Instruction  Buffer  Register  (IBR) 

The IBR holds the instruction currently being processed. It is 16 bits
wide and can be loaded from the V-register.

The IBR is divided into fields as shown in Figure 4.4-6. The Accumulator
pointer field is used to address one of the first eight General Registers
(0 through 7). The General Register pointer field can be used to address any
General Register. The shift count field can be transmitted to the Iteration
Counter over the I-Bus. The right/left indicator bit controls the A and X
registers' shift direction. The op-code field is used by the Microprogram
Control Unit (MCU) in instruction decoding.

Flag  Generation  Logic 

This logic network monitors the value on the A-Bus. Three flags
are generated:

A-Bus  = 0 
A-Bus  > 0 
A-Bus  < 0. 

All  three  flags  are  used  by  the  MCU  for  branching. 

The A-Bus

Table 4.4-1 summarizes the inputs and outputs of the A-Bus.

TABLE 4.4-1. INPUTS AND OUTPUTS OF THE A-BUS

Inputs               Outputs
-------------------  ----------------------
General Reg.         A-register
CROM                 B-register
ALU                  N-register
Multiply Net MSB     U-register
Multiply Net LSB     Flag generation logic*
V-register
X-register
-------------------  ----------------------
7 inputs             4+1 outputs

* Always receiving.

The I-Bus

Table 4.4-2 summarizes the inputs and outputs of the I-Bus.

TABLE 4.4-2. INPUTS AND OUTPUTS OF THE I-BUS

Inputs               Outputs
-------------------  ----------------------
General Reg.         I-Counter
CROM
IBR
-------------------  ----------------------
3 inputs             1 output

4.4.2b The Sequencing and I/O Chip

The architecture of the sequencing and I/O chip is shown in
Figure 4.4-7. The following data paths lead off this chip:

a. Main Bus (16 bits)

b. Address Bus (13 bits)

c. Input to Track-Bus (16 bits)

d. Two-way to neighbors (8 bits)

e. Output to Signal Processor (1 bit)

f. Input from previous Vector Buffer Controller (1 bit)

g. Output to subsequent Vector Buffer Controller (1 bit)

The above list does not include control and status lines. The Main
Bus is used to transmit and receive memory data. The Address Bus is used
to send memory addresses to the memory chips. The Track-Bus is not
under the control of the µPT. The data accumulated in the Vector Buffer are
periodically sent out over the Track-Bus.

Internally, the sequencing and I/O chip contains another bus, the
Program Counter Bus (PC-Bus), which allows selection of inputs to the
Program Counter (PC). (The PC-Bus may subsequently be replaced by a
multiplexer if that is advantageous.)


The sequencing and I/O chip can be functionally subdivided into
4 parts:

(1) the memory addressing function,

(2) the interrupt structure,

(3) the sequencing mechanism, and

(4) I/O.

The following is a description of these various functional units on
the sequencing and I/O chip.

(1) Memory Addressing Function

There are three sources of memory addresses in the µPT:

• the Program Counter,

• addresses in the instruction stream, and

• indirect addresses.

Addresses in the instruction stream may be indexed.

The following sections describe in detail how memory address
selection is implemented in the µPT.

The  Memory  Address  Register  (MAR) 

The Memory Address Register consists of a 13-bit address and an
"indirect" bit. They correspond to the 13 LSB and the MSB of the 16-bit
word, respectively. The MAR can be loaded from the Main Bus only.

The  Index 

The  Index  is  a 13-bit  register  used  to  hold  the  contents  of  one  of  the 
General  Registers,  number  1 through  7,  when  indexed  addressing  is  used. 
The  Index  is  loaded  from  the  Main-Bus  only. 
The Index Adder

The Index Adder is used for every memory access where the address
originates from the MAR (indexed or unindexed). The Index Adder can
produce 2 possible results:

INDEX + MAR
or
MAR.

The output of the Index Adder is 13 bits wide and can be transmitted to the
memory chips via the Address Bus.
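
The address path through the MAR and the Index Adder can be sketched as follows. The field positions follow the text (13 LSB for the address, MSB for the indirect bit); the wraparound of the sum to 13 bits is an assumption, as are the function names.

```python
ADDRESS_MASK = 0x1FFF    # 13 least significant bits
INDIRECT_BIT = 0x8000    # most significant bit of the 16-bit word


def load_mar(word):
    """Split a 16-bit word into (13-bit address, indirect flag)."""
    return word & ADDRESS_MASK, bool(word & INDIRECT_BIT)


def index_adder(mar_address, index=None):
    """Return MAR for unindexed access, or INDEX + MAR for indexed access.

    The result is truncated to the 13-bit width of the Address Bus
    (wraparound is assumed here, not stated in the report).
    """
    if index is None:
        return mar_address & ADDRESS_MASK
    return (mar_address + index) & ADDRESS_MASK
```

Thirteen address bits span 8K words, which matches the two 4K-word memory chips of the µPT.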

(2) Interrupt Structure

The Interrupt Vector (IV)

The IV consists of a 7-bit register and associated logic used to set
the register. The IV can be loaded from the Main Bus. The principal use
of the IV, however, is to record the occurrence of any of 7 interrupt levels.
A "1" in a certain bit position of the IV indicates the occurrence of that
interrupt.

The Interrupt Mask (IM)

This 7-bit register can be used to suppress the servicing of any
interrupt level. The contents of the IM are ANDed with the IV before any
further decisions are made. The 7-bit number thus obtained is encoded
as shown in Table 4.4-3, and the resulting 3-bit code is the highest current
interrupt level.

The  Level  Register 

This 3-bit register holds a value equal to the level of the interrupt being
processed currently. When no interrupt is being processed (normal program
flow) the value of the Level register is zero.
TABLE 4.4-3. ENCODING OF INTERRUPT LEVELS

IV .AND. IM    Encoded Value
               Decimal   Binary
-----------    -------   ------
0000000        0         000
0000001        1         001
000001X        2         010
00001XX        3         011
0001XXX        4         100
001XXXX        5         101
01XXXXX        6         110
1XXXXXX        7         111

Interrupts occur whenever the highest current interrupt level is
higher than the value in the Level register. For this purpose a comparator
continuously compares the value in the Level register with the highest
current interrupt level. The signal (INT) thus generated is monitored by the
Microprogram Control Unit.
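
The encoding of Table 4.4-3 and the comparison that raises INT can be expressed compactly. This is a behavioral sketch with hypothetical function names; as in the text, a "1" in the mask lets the corresponding level through.

```python
def highest_level(iv, im):
    """Encode (IV AND IM) as the highest pending interrupt level, 0 to 7.

    The level is the position of the most significant "1" among the
    7 bits, counted from 1 at the right; 0 means nothing is pending.
    """
    pending = iv & im & 0x7F
    return pending.bit_length()


def interrupt_requested(iv, im, level_register):
    """INT is raised when the pending level exceeds the current level."""
    return highest_level(iv, im) > level_register
```

An interrupt at the level already being serviced, or below it, therefore waits until the Level register drops.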

(3) Sequencing Mechanism

The Program Counter (PC)

The PC is a 13-bit up-counter. It can be loaded from the PC-Bus.
The PC contains the address of the next instruction to be executed. Addresses
from the PC can be transmitted to the memory chips over the Address Bus.
The contents of the PC can also be pushed onto the Interrupt Stack when an
interrupt occurs.

The PC can be loaded from various sources via the PC-Bus:

• from the MAR (for branch instructions),

• from a Trap Address Cell (for interrupts),

• from the Interrupt Stack (for resumption after interrupts),
and

• from a hardwired bootstrap address (for cold starts).
The Trap Address Cells (TAC)

There are 7 Trap Address Cells, each corresponding to one interrupt
level. Each TAC contains a memory address, and whenever the respective
level of interrupt occurs, interrupt handling is started at that address.

The TAC's consist of 7 registers, each 12 bits wide. It has been
mentioned that the value of one of the TAC's can be sent to the PC over the
PC-Bus. Selection of which TAC to use is based on the value in the Level
register.

The TAC's can be loaded from the Main Bus. For this purpose a
15-bit quantity consisting of a 3-bit TAC select code and a 12-bit trap
address is sent to the TAC's. Hardware associated with the TAC's will
load the proper cell.

The  Interrupt  Stack 

The Interrupt Stack consists of 7 cells (corresponding to the maximum
possible 7 nested interrupts).

Each cell consists of a 12-bit resumption address and a 3-bit resumption
level. Before an interrupt is serviced, the contents of the PC and of the
Level register are pushed onto the Interrupt Stack. When the servicing of
an interrupt is completed, the value from the top of the Interrupt Stack is
popped off and placed into the PC and Level register.

Pushing and popping of the Interrupt Stack is done by means of a Top
of Stack Counter (TSC) which is used to address the seven cells. To push,
the TSC is incremented and then the value is stored. To pop, the value is read
off the stack and the TSC is decremented.
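
The push and pop discipline can be modeled in a few lines. The class below is a behavioral sketch; the overflow and underflow checks and the numbering of cells from 1 to 7 are assumptions added for the example.

```python
class InterruptStack:
    """Seven-cell stack addressed by a Top of Stack Counter (TSC)."""

    DEPTH = 7

    def __init__(self):
        self.cells = [None] * (self.DEPTH + 1)   # cell 0 is unused
        self.tsc = 0                             # 0 means the stack is empty

    def push(self, pc, level):
        """Increment the TSC, then store (12-bit address, 3-bit level)."""
        if self.tsc == self.DEPTH:
            raise OverflowError("more than 7 nested interrupts")
        self.tsc += 1
        self.cells[self.tsc] = (pc & 0xFFF, level & 0x7)

    def pop(self):
        """Read the top cell, then decrement the TSC."""
        if self.tsc == 0:
            raise IndexError("interrupt stack is empty")
        value = self.cells[self.tsc]
        self.tsc -= 1
        return value
```

Incrementing before the store and decrementing after the read keeps the TSC pointing at the most recent entry at all times.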

(4)  I/O 

Vector Buffer (VB)

The Vector Buffer represents the I/O interface with the Track-Bus.
The VB contains one cell of storage for each track. Currently it appears
that 32 such cells will be sufficient. Each cell will be composed of a
number (n) of 16-bit words. A controller will allow the following functions to
be performed:

a. Receive a command indicating where the next n words are to
be placed.

b. Receive n 16-bit words following a command.

c. Receive a pulse from another µPT indicating that all valid
data in the VB are to be sent out on the Track-Bus.

d. Send out on the Track-Bus all valid data in the VB and mark the
entire VB as empty and available.

e. Send a pulse to another µPT indicating that this µPT is done
using the Track-Bus.

Threshold  Control  Register  (TCR) 

This register is organized as parallel input, serial output. It represents
the I/O interface between the µPT and the corresponding Signal Processor.

Message  Control  Network 

The Message Control Network (MCN), together with the Incoming
Message Register (IMR) and the Outgoing Message Register (OMR), forms
the I/O interface with the 8 neighboring µPTs.

To  send  a message  to  a neighbor,  the  OMR  is  loaded  and  thereafter 
the  MCN  handles  I/O  while  the  processor  continues  executing  the  instruction 
stream. 

Incoming messages are handled by the MCN without processor intervention
until the message is placed into the IMR. At that point an interrupt
is sent to the interrupt structure.

The  Memory  Chip 

Before discussing the organization of the memory chip it is necessary
to restate the functions of the memory as a whole. A memory of 7K
words of 16 bits each is required. The first 5K words are dedicated to storing
programs, constants and variables. The next 1K words of memory comprise
a special purpose memory dedicated to the task of acquiring data for software
temporal filtering. The last 1K words are dedicated to storing the hits
of the latest frame and are duplicated. The duplicate memory is isolated
from the processor address space and is loaded with hit data from the
Signal Processor over the alternate memory interface. At the end of the
frame time the two 1K memories exchange roles.

The first 5K words can also be loaded from the alternate interface.
The Data and Control Processor may, at its discretion, load new programs
into any or all µPTs. A bit-serial interface is provided for that purpose on
the alternate interface.

It should be noted that while accesses to the memory coming from
the processor are random, program loading, as well as data loading from
the Signal Processor, are both sequential in nature.

The  architecture  of  the  memory  chip  is  shown  in  Figure  4.  4-8. 

The  chip  was  designed  so  that  only  one  type  of  chip  need  be  developed.  Thus 
the  cost  of  developing  two  or  more  chip  types  is  eliminated. 


Figure  4.4-8.  The  memory  chip. 


The  Memory  Proper 


Within the memory chip lies the memory storage area and memory
controller. The storage area is organized as 4096 words with 16 bits
per word. The memory controller is the logic required to coordinate
memory operations, such as Read/Write, address decode, request complete
indication, etc. Thus, this chip constitutes a complete memory
unit.

To access the memory, a predetermined set of procedures must be
followed. The signal to read or write must be set up along with the address.
Following this, the enable to the chip is activated. The memory then performs
the read or write and acknowledges completion of the task. For read
operations the data is enabled onto the main bus at the same time the completion
signal is generated. To change the address and R/W signals, the
chip enable must be disabled to protect the memory contents from being
destroyed.

Chip  Identification 

Note that the entire address (13 bits) is connected to all memory
chips. However, each chip contains only 1K of address space. The 1K space
requires 10 bits of address, and the I.D. scheme is as follows: The three
MSB of the address are compared to the chip I.D. A match implies that the
desired address (13 bits) lies on this chip. A mismatch implies that this
chip does not contain the desired address. Hence, each memory chip will
have a unique I.D. (with the exception of the hit memories). Since only one
chip will match the three MSB at any access, no conflicts will arise. Also
notice that the chip I.D. match logic is tied into the chip enable logic to
further protect the memory contents.
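
The I.D. comparison amounts to splitting the 13-bit address into a 3-bit chip number and a 10-bit on-chip offset. The sketch below illustrates this; the function names `split_address` and `chip_enabled` are hypothetical, and the way the match is combined with the external enable is an assumption based on the description above.

```python
CHIP_ADDRESS_BITS = 10            # 1K words of address space per chip
ADDRESS_MASK = (1 << 13) - 1      # 13-bit address presented to every chip


def split_address(address):
    """Return (chip I.D., on-chip offset) for a 13-bit address."""
    address &= ADDRESS_MASK
    chip_id = address >> CHIP_ADDRESS_BITS               # three MSB
    offset = address & ((1 << CHIP_ADDRESS_BITS) - 1)    # ten LSB
    return chip_id, offset


def chip_enabled(chip_id, address, enable):
    """A chip responds only if it is enabled AND its I.D. matches (assumed gating)."""
    return bool(enable) and split_address(address)[0] == chip_id
```

For example, address 5123 (5 x 1024 + 3) selects offset 3 on the chip whose I.D. is 5; every other chip sees a mismatch and stays disabled.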

Data  I/O 

There  exist  three  data  inputs  to  the  memory  chip  and  one  data  output. 
The  primary  I/O  channel  is  the  Main  Bus.  Data  sent  across  the  Main  Bus 
will  be  stored  if  so  desired,  or,  if  a read  was  requested,  data  will  be  put  on 
the  Main  Bus. 


The  two  other  inputs  constitute  the  alternate  interface.  One  data 
path  is  a 16-bit  parallel  input  and  is  used  for  signal  processor  data.  The 
other  path  is  a single  line  bit  serial  path  to  be  used  for  program  load.  The 
serial  input  is  converted  to  a parallel  format  on  the  chip  and  then  stored 
in  the  memory. 

Addressing 

Two  address  sources  are  present  to  the  memory,  the  Address  Bus 
and  the  Sequence  Counter  on  the  chip. 

The Address Bus supplies an address to the memory chip from the
CPU. The Sequence Counter is used when one of the write-only ports is being
used. Prior to data transmission the counter is reset. Each data word is
then stored at an incrementally higher address as specified by the counter.
The external system controls the count enable line to the Sequence Counter.

The 'T' Flip-Flop

The 'T' flip-flop controls the current interface to the memory proper.
In the zero state, the primary interface to the main bus is enabled and all
signals on the other interface are ignored. In the one state, the secondary
interface is active while the primary is disabled.

In  the  program  memory,  the  T is  set  prior  to  loading  a new  program 
and  is  under  the  control  of  the  DCP.  In  the  hit  memory,  the  T is  under  the 
control  of  the  unit  sending  data  to  the  hit  memory  and  will  be  toggled  at  the 
start  of  each  new  frame  of  data. 

The  Microprogram  Control  Unit  (MCU) 

The architecture of the MCU is shown in Figure 4.4-9. The following
data paths lead off the MCU chip:

a. Encoded Control Signals to all other chips of the µPT.

b. Status Flags from all other chips of the µPT.

c.  Op-code  from  IBR  on  Arithmetic  Chip. 

d.  INT-flag  from  Sequencing  and  I/O  chip. 


b.  6 bits  from  the  Op-code  field  of  the  IBR  with  three  zeros  as 
MSB's;  used  for  instruction  decoding;  or 

c.  a hardwired  address  used  to  branch  into  a section  of  the  micro- 
program which  is  dedicated  to  trapping  interrupts. 

Selection  between  these  sources  is  made  by  means  of  the  Address 
Multiplexer.  The  selection  is  based  on  three  control  bits;  one  is  the  INT 
signal  from  the  Interrupt  Structure;  the  others  originate  from  the  Command 
Register. 

Most of the time the address based on the Next Address Field is
selected. Only once in the execution of each instruction do the other sources
come into play. Whenever a new instruction has to be decoded, the INT signal
will select between the Op-code (for instruction decoding, if INT=0) and the
interrupt trap address (if INT=1).

The  Flag  Select  Multiplexer 

This Multiplexer allows selection of one of the many status flags
from the µPT in order to be appended to the Next Address Field. The controls
for the multiplexer originate in the Command Register.

The Instruction Repertoire of the µPT

The µPT has an instruction set consisting of 40 instructions. Most
instructions occupy one word in the memory but some are doubleword
instructions. Indexing and indirect addressing are available on some
instructions.

There  are  three  basic  instruction  formats  as  shown  in  Figure  4.4-10. 
These  are: 

• register  — register  (RR) 

• register  — memory  (RM) 

• register  — shift  (RS). 

Figure 4.4-11 presents a summary of the instruction set. The
following paragraphs describe the instruction set in detail.

[Figure 4.4-10. Instruction formats.]

[Figure 4.4-11. Summary of the instruction set.]


(1)  Data  Moving  Instructions 


Load  (RM) 

The contents of the memory address indicated in the second word of
the instruction are loaded into the General Register specified by REG. If
the ACC field is not zero, indexing will be performed using the General Register
indicated by ACC as the index. Indirect addressing can be specified by
placing a '1' in the most significant bit of the second word of the instruction.

Store  (RM) 

The  contents  of  the  General  Register  specified  by  the  REG  field  are 
stored  in  the  memory.  The  memory  address  is  computed  as  for  LOAD. 

Exchange  (RM) 

The  contents  of  the  General  Register  indicated  by  REG  are  exchanged 
with  the  contents  of  the  memory  location  specified.  The  memory  address  is 
computed  as  for  LOAD. 

Block  Load  (RM) 

A block of data from consecutive memory locations is loaded into
consecutive General Registers. The beginning of the data in memory is
specified by the address in the second word of the instruction, which may
be an indirect address. Indexing is not available. The REG field specifies
the General Register where loading is to begin. The right half of the General
Register indicated by ACC contains the length of the data block to be
moved. The length of the block must be between 1 and 256. A length of
256 is indicated by 0. (The left 8 bits of the General Register are ignored.)
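
The length encoding can be illustrated with a short sketch. The function name `block_length` is hypothetical; the only behavior taken from the text is that the right 8 bits carry the count, the left 8 bits are ignored, and a count of 0 stands for 256.

```python
def block_length(acc_register):
    """Decode the block length held in the right half of a 16-bit register."""
    count = acc_register & 0xFF          # left 8 bits are ignored
    return 256 if count == 0 else count  # 0 encodes the maximum length
```

With this encoding an 8-bit field covers all 256 permitted block lengths, at the cost of not being able to express an empty block.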

Block  Store  (RM) 

Like  Block  Load  except  the  data  is  stored,  not  loaded. 

Load  Immediate  (RM) 

Unlike other RM-format instructions, the second word of the instruction
here is data rather than an address. The data from the second word
is loaded into the General Register specified by the REG field. The ACC
field is not used.

Register  to  Accumulator  (RR) 

The  contents  of  the  General  Register  specified  by  the  REG-field  are 
copied  into  the  General  Register  specified  by  the  ACC-field. 

Accumulator  to  Register  (RR) 

The  contents  of  the  General  Register  specified  by  the  ACC-field  are 
copied  into  the  General  Register  specified  by  the  REG-field. 

(2)  Arithmetic  Instructions 

All  arithmetic  operations  assume  a fractional  2's  complement  number 
system.  The  binary  point  is  implied  between  the  sign-bit  and  the  bit  to  its 
right: 

S.XXXX...X
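
As an illustration of this number system, the sketch below converts between a 16-bit register pattern and the fraction it represents. The 16-bit width is taken from the memory word size; the function names and the choice to raise an error on out-of-range values are assumptions made for the example.

```python
def word_to_fraction(word):
    """Interpret a 16-bit pattern as a fractional 2's complement number."""
    word &= 0xFFFF
    if word & 0x8000:          # sign bit set: value is negative
        word -= 0x10000
    return word / 32768.0      # binary point lies just right of the sign bit


def fraction_to_word(value):
    """Encode a fraction in [-1, 1) as a 16-bit pattern (rounded to nearest)."""
    scaled = round(value * 32768)
    if not -32768 <= scaled <= 32767:
        raise OverflowError("value outside the representable range [-1, 1)")
    return scaled & 0xFFFF
```

The representable range is -1 up to, but not including, +1, which is why sums and quotients of two in-range operands can overflow while products cannot exceed the range except for (-1) x (-1).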

Add  (RR) 

The contents of the General Register specified by ACC are added to
the contents of the General Register specified by REG and the result is
placed into the General Register specified by REG. Overflow may occur.

Subtract  (RR) 

The  contents  of  the  General  Register  specified  by  ACC  are  subtracted 
from  the  General  Register  specified  by  REG  and  the  result  is  placed  into  the 
General  Register  specified  by  REG.  Overflow  may  occur. 

Multiply  (RR) 

The contents of the General Registers specified by REG and ACC are
multiplied and the double precision product consisting of two signed 16-bit
numbers is placed into the General Register specified by REG and into the
one immediately after it.

Division (RR)

The double precision number from the General Register specified
by REG and the one immediately after it is divided by the contents of the
General Register specified by ACC. The quotient is a single precision number
and is placed into the General Register specified by REG. Overflow may
occur.

Increment  (RR) 

The  value  of  the  General  Register  specified  by  REG  is  incremented. 
The  ACC  field  is  not  used.  Overflow  may  occur. 

Decrement  (RR) 

The  value  of  the  General  Register  specified  by  REG  is  decremented. 
The  ACC  field  is  not  used.  Overflow  may  occur. 

(3)  Logical  Instructions 

And  (RR) 

The  contents  of  the  General  Registers  specified  by  REG  and  ACC  are 
ANDed  and  the  result  is  placed  into  the  General  Register  specified  by  REG. 

Or (RR) 

The  contents  of  the  General  Registers  specified  by  REG  and  ACC  are 
ORed  and  the  result  is  placed  into  the  General  Register  specified  by  REG. 

Exclusive  Or  (RR) 

The  "exclusive  or"  of  the  contents  of  the  General  Register  specified 
by  REG  and  ACC  is  computed  and  the  result  is  placed  into  the  General 
Register  specified  by  REG. 

1's Complement (RR)

The contents of the General Register specified by REG are 1's complemented.
The ACC field is not used.

(4) Branch Instructions

Branch Unconditionally (RM)

The next instruction to be executed is the one specified by the
address in the second word of the branch instruction. This address may be
indirect. The REG and ACC fields are not used.


Branch  if  Register  is  Zero  (RM) 

The branch is taken if the General Register specified by the REG
field contains zero. Otherwise the instruction following the branch instruction
is executed next. The ACC field is not used.

Branch  if  (REG)  = (ACC)  (RM) 

The  branch  is  taken  if  the  contents  of  the  General  Registers  specified 
by  REG  and  ACC  are  equal. 


Branch  if  (REG)  > (ACC)  (RM) 

The branch is taken if the contents of the General Register specified
by REG are greater than the contents of the General Register specified by
ACC.

Branch  if  (REG)  < (ACC)  (RM) 

The  branch  is  taken  if  the  contents  of  the  General  Register  specified 
by  REG  are  less  than  the  contents  of  the  General  Register  specified  by  ACC. 

Increment (ACC) and Branch if (ACC) ≤ (REG) (RM)

This instruction allows easy implementation of DO-loops. The contents
of the General Register designated by ACC are incremented. The
branch is taken unless the new value of (ACC) is greater than (REG).
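
The DO-loop behavior can be sketched as follows. This models only the increment and the branch decision; register overflow, the encoding of the branch target, and the function names are assumptions introduced for the illustration.

```python
def increment_and_branch(regs, acc, reg, target, next_pc):
    """Increment regs[acc]; return the branch target unless it now exceeds regs[reg]."""
    regs[acc] += 1
    return target if regs[acc] <= regs[reg] else next_pc


def run_do_loop(limit):
    """Run a loop body for counter values 1..limit and record each value seen."""
    regs = [0] * 8            # hypothetical register file
    ACC, REG = 1, 2           # hypothetical register numbers
    regs[ACC], regs[REG] = 1, limit
    visited = []
    pc = "BODY"
    while pc == "BODY":
        visited.append(regs[ACC])                       # loop body
        pc = increment_and_branch(regs, ACC, REG, "BODY", "EXIT")
    return visited
```

Placing the instruction at the bottom of the loop gives the usual DO-loop form: the body always runs at least once, and the final pass is the one in which the counter equals the limit.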

Branch  if  Overflow  (RM) 

The branch will be taken if the instruction immediately preceding
the branch instruction was an add, subtract or divide instruction and if an
overflow had occurred.

(5) Shift Instructions


Arithmetic  Shift  (RS) 

The  contents  of  the  General  Register  specified  by  ACC  are  arithmetically 
shifted  the  number  of  bits  indicated  by  SC,  to  the  right  or  to  the  left  depending 
on  the  state  of  the  R/L  bit. 

Doubleword  Arithmetic  Shift  (RS) 

The  doubleword  contained  in  the  General  Register  specified  by  ACC 
and  the  one  immediately  after  it  are  arithmetically  shifted  as  specified  by 
SC  and  R/L. 

Rotate  (RS) 

The  contents  of  the  General  Register  specified  by  ACC  are  rotated 
as  specified  by  SC  and  R/L. 

Shift Logical 0-Extended (RS)

The contents of the General Register specified by ACC are shifted
as specified by SC and R/L, and the vacated bit positions are filled with
zeros.

Shift Logical 1-Extended (RS)

The contents of the General Register specified by ACC are shifted
as specified by SC and R/L, and the vacated bit positions are filled with ones.

(6)  I/O  Instructions 

Output  to  Vector  Buffer  (RR) 

A track data item consisting of (TBD) words is moved from the
General Registers, beginning with the one specified by REG, to the Vector
Buffer for output onto the Track Bus. ACC is not used.

Output to Signal Processor (RR)


A threshold control block consisting of (TBD) words is moved from
the General Registers, beginning with the one specified by REG, to the
Threshold Control Register and the output operation is initiated. ACC is
not used.

Output  to  Neighbor  (RR) 

One outgoing message block consisting of (TBD) words is moved
from the General Registers, beginning with the one specified by REG, into
the Outgoing Message Register and the output operation is initiated. The
ACC field is not used.

Input  from  Neighbor  (RR) 

One  incoming  message  block  consisting  of  (TBD)  words  is  moved 
from  the  Incoming  Message  Register  to  the  General  Registers,  beginning 
with  the  register  specified  in  REG.  ACC  is  not  used. 

(7)  Interrupt  Handling  Instructions 

Set  Trap  Address  (RR) 

This  instruction  allows  the  setting  of  the  Trap  Address  Cell  (TAC) 
for  a certain  interrupt  level.  The  contents  of  the  General  Register  specified 
by REG must be as follows:

• bits 0-2 contain a level number between 1 and 7,

• bits  4-15  contain  a 12-bit  address  which  will  be  written  into 
the  TAC, 

• bit  3 is  not  used. 

The  ACC  field  is  not  used. 
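
The register layout can be illustrated with the decoding sketch below. The report does not state the bit-numbering convention; the sketch assumes bit 0 is the most significant bit, so that the level occupies the top three bits and the address the low twelve. The function name `decode_trap_word` is hypothetical.

```python
def decode_trap_word(word):
    """Split a 16-bit register value into (interrupt level, 12-bit trap address).

    Assumes bit 0 is the MSB: bits 0-2 = level, bit 3 unused, bits 4-15 = address.
    """
    word &= 0xFFFF
    level = word >> 13           # bits 0-2
    address = word & 0x0FFF      # bits 4-15
    if not 1 <= level <= 7:
        raise ValueError("level number must be between 1 and 7")
    return level, address
```

Under this assumption the value 0x6ABC selects level 3 and writes address 0xABC into that level's Trap Address Cell. If bit 0 were instead the least significant bit, the two fields would simply swap ends of the word.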

Set  Mask  Register  (RR) 

The least significant 7 bits of the General Register specified by REG
are moved into the Mask Register in the Interrupt Structure. ACC is not
used.


Resume  (RR) 

Execution  of  the  Resume  instruction  causes  the  Interrupt  Stack  to  be 
popped  and  the  value  of  the  PC  and  of  the  Level  Register  to  be  restored. 
REG  and  ACC  are  not  used. 

(8)  Miscellaneous  Instructions 


Halt  (RR) 

This  instruction  halts  the  machine  until  an  external  interrupt  starts 
it  again. 

No-op  (RR) 

This instruction is the null operation.

Reset  and  Start  (RR) 

Resets  the  state  of  the  machine  to  the  initial  state  (TBD)  and  starts 
execution  at  the  hardwired  bootstrap  address. 


5.0 APSP SOFTWARE


This section contains a general discussion of tracking techniques,
followed by a discussion of two algorithms: one for tracking below the
horizon (BTH) and the other for star rejection above the horizon (ATH).

5.1 Tracking Techniques

There are two tracking techniques proposed for consideration and
flow diagrams are given in Figure 5.1-1. A brief comparison of the relative
advantages and disadvantages is given in Table 5.1-1. The first method
operates  basically  in  a sequential  (real  time)  manner  so  that  one  scan  of  data 
is  processed  at  a time.  The  second  method  would  operate  on  three  scans  of 
data  when  considering  existing  tracks  and  would  combine  the  detection  and 
estimation  processes. 

5.2 Technique 1

The elements of this technique are basically those that have classically
been associated with the multi-target track problem. The processing is
accomplished in real time so that state variable updates are obtained at the
end of each computational frame.

Peak  Detection 

Peak detection could be used to improve the estimate of target position
at the sampling interval. This should lead to some improvement over
the alternative of placing the target measurement at the center of the pixel
where a detection occurred. The tradeoff between the improvement in
measurement and the required computational complexity should be
investigated.

Figure 5.1-1a. Flow chart of tracking technique 1.

Figure 5.1-1b. Flow chart of tracking technique 2.

TABLE 5.1-1. METHOD COMPARISON

Method  Advantage(s)                          Disadvantage(s)

1       1. Least computational complexity     Lower bound on performance
        2. Closest to "classical" method
           for multi-target track

2       Best performance                      Most computational complexity

Redundancy  Elimination 

During one scan period the same target may produce detections in
more than one pixel. Thus, to reduce measurement error and to reduce the
probability of more than one track being initiated on the same target,
redundancy elimination logic is required. This would involve some type of
simple space centroiding of observations received on adjacent pixels.

Association  and  Correlation 

Standard association and correlation algorithms employ the nearest
neighbor technique. For a single track, with gates not overlapping those of
any other track, this merely involves finding the observation with the minimum
normalized distance from the predicted track position. More complex
conflicting situations may occur when track gates overlap and one or more
observations are received in the region of overlap. For this more complex
situation a correlation matrix with simplified solutions to the
classical assignment problem is employed.
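
For the simple, non-overlapping case the nearest neighbor rule can be sketched as follows. The normalization by per-axis standard deviations, the circular gate in normalized units, and all names are assumptions made for the illustration; the report does not define the distance function at this point.

```python
def nearest_neighbor(predicted, observations, sigma, gate):
    """Return the index of the in-gate observation closest to the prediction.

    predicted    -- (x, y) predicted track position
    observations -- list of (x, y) observation positions
    sigma        -- (sx, sy) assumed normalizing standard deviations
    gate         -- gate radius in normalized units
    Returns None when no observation falls inside the gate.
    """
    best_index, best_d2 = None, None
    for j, (x, y) in enumerate(observations):
        d2 = ((x - predicted[0]) / sigma[0]) ** 2 \
           + ((y - predicted[1]) / sigma[1]) ** 2
        if d2 > gate * gate:
            continue                      # outside the gate: not a candidate
        if best_d2 is None or d2 < best_d2:
            best_index, best_d2 = j, d2   # closest candidate so far
    return best_index
```

When gates overlap, this per-track rule can assign the same observation to two tracks, which is the situation the correlation matrix approach is meant to resolve.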

Track Initiation and Deletion

Observations which are not associated with existing tracks are used
to initiate new tentative tracks. Then, an additional criterion is typically
required for a new track to become confirmed. Tracks are deleted when
poor quality or no observations are received for update. The deletion criterion
is more difficult to satisfy for confirmed tracks. Performance for
the presently suggested initiation/deletion algorithms is given in Section 6.3.

Track Update and Prediction

New observations are incorporated in the track and an updated state
variable estimate is formed. Possible candidates for tracking filters in
position are the constant coefficient α-β and α-β-γ trackers or a two- or three-
state Kalman filter. Section 6.3 provides performance for the α-β tracker for
both aircraft and missiles. The effects of crossing targets are also considered.
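
For reference, one update cycle of a constant coefficient α-β tracker in its common textbook form is sketched below. This is not taken from the report; the function name, the argument names, and the frame interval `T` are assumptions for the illustration.

```python
def alpha_beta_step(x_s, v_s, z, T, alpha, beta):
    """One alpha-beta update for a single coordinate.

    x_s, v_s -- smoothed position and velocity from the previous frame
    z        -- new position observation
    T        -- time between frames
    Returns the new smoothed (position, velocity).
    """
    x_p = x_s + T * v_s                   # predict position to the new frame
    residual = z - x_p                    # observation minus prediction
    x_new = x_p + alpha * residual        # smooth position
    v_new = v_s + (beta / T) * residual   # smooth velocity
    return x_new, v_new
```

With an observation that agrees exactly with the prediction the state is simply propagated; otherwise the gains α and β set how much of the residual is absorbed into position and velocity.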


Gate  Generation 

Gates are formed around the target's predicted state variable estimate.
Only those observations found within the gate are considered (in the association
and correlation algorithm) for potential track update.

5.3 Technique 2

The second technique employs a limited form of batch
processing. This would be done in the gated region of the predicted target
state variable and would replace the redundancy elimination and the association
and correlation functions required for Technique 1. Also, this method
provides a velocity measurement so that a higher order filter will be
appropriate.

Batch  Processing  for  Tracking 

This tracking technique involves taking in "all" the MPFA data over
some extended period of time and performing batch processing. The disadvantages
of this technique are the inordinate amount of computational capability
required and the significant delay in availability of data. For these
reasons, the multiple-target track problem is commonly accomplished in
real time as discussed in Technique 1. However, by doing restricted (in
time) batch processing, followed by a form of the real time, multiple-target
tracking algorithms, it is possible to obtain some of the advantages of both
batch and real time. This is accomplished by doing batch processing over
a small number of pixels (say 3) and then doing real time tracking at reduced
data rates. The track predicted values are used to restrict the amount of
batch processing as indicated below.

Equations for Batch Processing

An illustrative batch track filtering algorithm can be defined by
the following equations:


\[
\begin{aligned}
F_{m,n,i,j,t} ={}& P_{i-m,j-n,t-1} - W_1 P_{i-m,j-n,t} - W_2 P_{i-m,j-n,t+1} \\
& - \tfrac{1}{2} P_{i,j,t-1} + P_{i,j,t} - \tfrac{1}{2} P_{i,j,t+1} \\
& - W_2 P_{i+m,j+n,t-1} - W_1 P_{i+m,j+n,t} + P_{i+m,j+n,t+1}
\end{aligned}
\]

where

F_{m,n,i,j,t} = filter output at time t

and

\[
\begin{aligned}
P_{i,j} ={}& D_{i,j} - W_E \left( D_{i-1,j} + D_{i+1,j} + D_{i,j-1} + D_{i,j+1} \right) \\
& - W_D \left( D_{i-1,j-1} + D_{i-1,j+1} + D_{i+1,j-1} + D_{i+1,j+1} \right)
\end{aligned}
\]

This equation provides nine different filters (m, n = 0, ±1), needs only three
frames of data storage (the P's at three different times), and automatically
performs the redundancy elimination function of Technique 1. Here

D is data from the sensor,
i and j are integer pixel indexes,
t is an integer time sample index,
W_E, W_D, W_1, W_2 are filter constants.
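
A minimal sketch of this batch filter is given below. It assumes that all weighted neighbor terms enter with a negative sign, that pixels outside the frame contribute zero, and that the weights are free parameters; the function names are hypothetical and no weight values are implied by the report.

```python
def pix(frame, i, j):
    """Return frame[i][j], or 0.0 outside the frame (border handling is assumed)."""
    if 0 <= i < len(frame) and 0 <= j < len(frame[0]):
        return frame[i][j]
    return 0.0


def spatial_filter(D, w_e, w_d):
    """Form P from sensor data D using edge weight w_e and diagonal weight w_d."""
    rows, cols = len(D), len(D[0])
    P = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            edge = (pix(D, i - 1, j) + pix(D, i + 1, j)
                    + pix(D, i, j - 1) + pix(D, i, j + 1))
            diag = (pix(D, i - 1, j - 1) + pix(D, i - 1, j + 1)
                    + pix(D, i + 1, j - 1) + pix(D, i + 1, j + 1))
            P[i][j] = D[i][j] - w_e * edge - w_d * diag
    return P


def batch_filter(P_prev, P_now, P_next, i, j, m, n, w1, w2):
    """Filter output at pixel (i, j) for a target moving (m, n) pixels per frame."""
    a, b = i - m, j - n      # pixel the target occupied one frame earlier
    c, d = i + m, j + n      # pixel the target occupies one frame later
    return (pix(P_prev, a, b) - w1 * pix(P_now, a, b) - w2 * pix(P_next, a, b)
            - 0.5 * pix(P_prev, i, j) + pix(P_now, i, j) - 0.5 * pix(P_next, i, j)
            - w2 * pix(P_prev, c, d) - w1 * pix(P_now, c, d) + pix(P_next, c, d))
```

Evaluating `batch_filter` for the nine combinations of m, n in {-1, 0, 1} and selecting the largest output gives both a detection and a coarse velocity estimate, since the best-matching (m, n) indicates the direction of motion.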

The filter is illustrated in Figure 5.3-1. Such a filter would be
velocity sensitive and thus would potentially generate more information
than a filter using time and space independently. Figure 5.3-2 shows the
average speed response of such a three-dimensional filter tuned to targets
moving in the i-direction at a speed of 1 pixel/frame. Figure 5.3-3 shows
the average directional response of the same filter, 0° being the x-direction.
In practice, the x-direction lies along the target's predicted path and the time
dependent filter weights are based upon the extrapolated covariance estimates
from the tracker. Since the error estimates increase with time, some
performance degradation occurs, which reduces the natural performance
improvement of batch over real time and restricts the amount of batch
processing (using a single batch filter) that can be accomplished with this
technique. It is noted that for effective usage of this technique either

1. Both position and velocity must be obtained from the
batch filter. The tracker data rate is then reduced
and a set of equations similar to that above is
needed.

2. Or only position information is used, with overlapping batch
processing windows and no reduction in the track data rate.

Figure 5.3-2. 3D filter response. [Axis label: speed (pixels/frame).]

Figure 5.3-3. 3D filter response.

Track  Update  and  Prediction 

The proposed batch measurement process will provide a velocity
estimate. Thus, the filter should be designed to include this additional
measurement. Again standard fixed-coefficient and Kalman filtering
algorithms are available. For example, one form of the fixed-coefficient
filter incorporating the velocity measurement is:


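\[
\begin{aligned}
X_s(n) &= X_p(n) + \alpha_1 \left[ X_o(n) - X_p(n) \right], \\
\dot{X}_s(n) &= \dot{X}_p(n) + \alpha_2 \left[ \dot{X}_o(n) - \dot{X}_p(n) \right], \\
\ddot{X}(n) &= \ddot{X}(n-1) + \frac{\beta}{T} \left[ \dot{X}_o(n) - \dot{X}_p(n) \right], \\
X_p(n+1) &= X_s(n) + T \dot{X}_s(n) + \frac{T^2}{2} \ddot{X}(n), \\
\dot{X}_p(n+1) &= \dot{X}_s(n) + T \ddot{X}(n),
\end{aligned}
\]

where subscripts o, p, and s refer to observed, predicted and smoothed
quantities.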


5.4 APSP Software

Tracking Algorithm Development

The  fundamental  aspects  of  the  APSP  multi-target  track  problem  are 
track  initiation,  gating,  association  and  correlation  of  observations  and  tracks, 
updating  of  existing  tracks,  prediction  of  next  observation,  and  track  deletion. 

Two  preliminary  algorithms,  one  for  tracking  and  another  for  star 
rejection  have  been  designed  for  these  functions  and  will  be  discussed  below. 

It  is  important  to  minimize  the  number  of  false  tracks.  Thus,  the 
proposed  track  initiation  and  gating  routines  provide  safeguards  against  the 
initiation  and  maintenance  of  a track  on  observations  which  are  inconsistent 
with the expected maximum target velocity, maximum acceleration, intensity
and rate of change of intensity. In addition, a waveform discrimination
routine  has  been  developed.  Correlation  between  the  positive  and  negative 
peaks  of  the  point  target  impulse  response  can  be  used  to  discriminate  point 
targets  from  extended  clutter.  Also,  a sophisticated  algorithm  which  utilizes 
the  vector  information  is  used  for  the  deletion  routine. 


Tracker  Software  Implementation 

Figure 5.4-1 shows a top level functional flow diagram of the program
used for target tracking. Each of the functional blocks is discussed below.

The program commences at START every frame time (~0.1 sec). At the
START of the program, the detected hits over the previous frame are
assumed to be stored in an observation file consisting of pixel coordinates
(x*, y*) and amplitude, A*, for each hit. The first task performed is the
initiation function, which presets various tables and counters to required
values. The track index, i, is initialized to 1. The tracks are numbered
from 1 to N_MAX, with i being the track index. At any one time, N tracks
are valid or active and N_MAX - N are inactive. Each valid track has
associated with it a track file consisting of the state vector and other
information for that track. Inactive track files are available for new
tracks. For invalid tracks, the main part of the program, track update, is
skipped, as shown in the flow diagram.
For valid tracks, all of the observations are scanned to see which, if
any, are within the gate of the track presently under consideration. A
generalized distance function can be used, as shown, and the size and shape
of the gate can be different for each track. For each track a table, J,
containing the index values, j, of observations that correlate with that
track, is compiled. Typically zero, one, or two observations will correlate
with a given track.
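
The gating step can be illustrated with a short sketch. It follows the
general shape of BTH block 3 (per-dimension gate limits, then a generalized
distance), but the absolute-value test, the normalizing terms `scales`, and
all function names are assumptions made here for illustration; the report
does not give them in legible form. Indices are zero-based in the sketch.

```python
import math

def gate_distance(track, obs, limits, scales):
    """Generalized distance between one track and one observation.

    track, obs : (x, y, amplitude) tuples
    limits     : (Cx, Cy, Ca) gate limits per dimension
    scales     : assumed normalizing terms per dimension (hypothetical)
    """
    diffs = [o - t for o, t in zip(obs, track)]
    # Outside the gate in any dimension: treat the distance as infinite.
    if any(abs(d) > c for d, c in zip(diffs, limits)):
        return math.inf
    return sum((d / s) ** 2 for d, s in zip(diffs, scales))

def build_gate_table(track, observations, limits, scales):
    """Return table J: indices j of observations that fall inside the gate."""
    return [j for j, obs in enumerate(observations)
            if math.isfinite(gate_distance(track, obs, limits, scales))]
```

Because `limits` and `scales` are passed per call, each track can carry its
own gate size and shape, as the text allows.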

Next, the confirmation, extrapolation or deletion algorithm is
executed. A new or tentative track will be confirmed if jd hits are received
on the first k frames. Otherwise the track is deleted. A track will be
extrapolated if no hits are received on this frame. For a track that has
reached confirmation, deletion will occur in the event of m consecutive
frames of extrapolation. For extrapolating tracks the observational
innovations ΔX, ΔY, and ΔA are left unchanged.
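
A minimal sketch of the confirmation and deletion rule described above is
given here. The state names, the `TrackStatus` structure, and the default
values of `j_hits`, `k_frames` and `m_misses` are hypothetical; the report
only states that such thresholds exist.

```python
from dataclasses import dataclass

@dataclass
class TrackStatus:
    state: str = "tentative"  # "tentative", "confirmed" or "deleted"
    frames: int = 0           # frames elapsed while tentative
    hits: int = 0             # hits received while tentative
    misses: int = 0           # consecutive extrapolated frames once confirmed

def advance(track, hit, j_hits=2, k_frames=3, m_misses=3):
    """Advance one track by one frame; `hit` is True if an observation gated."""
    if track.state == "deleted":
        return track
    if track.state == "tentative":
        track.frames += 1
        track.hits += int(hit)
        if track.hits >= j_hits:
            track.state = "confirmed"
        elif track.frames >= k_frames:
            track.state = "deleted"   # failed to confirm within k frames
        return track
    # Confirmed track: count consecutive frames of extrapolation.
    track.misses = 0 if hit else track.misses + 1
    if track.misses >= m_misses:
        track.state = "deleted"
    return track
```

Whether the observation that creates a tentative track counts as its first
hit is not stated in the text; the sketch counts only hits passed to
`advance`.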

For  each  observation,  J(r),  that  correlates  with  track  i,  a weighting 
function,  p(r),  is  computed  denoting  the  quality  of  that  observation.  Next  the 
p(r)'s  are  used  to  compute  innovations  for  each  dimension  by  summing  the 
products  of  the  p(r)'s  and  the  corresponding  observational  residuals  as 
shown. 

In the next block, the total change in the track i state vector is
computed using the innovations weighted by the α, β gain factors.
Reasonableness checks can be performed on the deltas, and large enough
values can result in track deletion. For reasonable deltas, the track i
state vector is updated in the next block. Then the α, β gain factors are
selected for use in the next frame. The maximum p(r) value is first found,
and one of three sets of α, β values is selected based on the value of that
maximum p(r).
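
The weighted innovation and the α-β update can be sketched for a single
dimension as follows. The update order (rate first, then position using the
new rate) follows BTH block 4 as far as it can be read; the function names
are illustrative, and the sketch does not assume that the weights p(r) are
normalized.

```python
def weighted_innovation(weights, observed, predicted):
    """Sum of p(r) times the residual (observation minus track value)."""
    return sum(p * (z - predicted) for p, z in zip(weights, observed))

def alpha_beta_update(value, rate, innovation, alpha, beta, T):
    """One alpha-beta step for one state dimension over frame time T.

    The rate is corrected first, and the position is then corrected and
    advanced by T times the updated rate.
    """
    rate = rate + beta * innovation / T
    value = value + alpha * innovation + T * rate
    return value, rate
```

With a zero innovation the step reduces to pure extrapolation, which matches
the handling of frames in which no observation falls within the gate.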

Using  the  updated  state  vector,  the  track  i is  next  checked  for  having 
crossed  the  boundary  of  the  tracker's  domain.  For  confirmed  tracks  within 
the  boundary,  the  track  file  for  track  i is  transferred  to  the  Vector  Buffer 
to  be  output  on  the  Track  Bus.  The  program  then  proceeds  to  the  next 
iteration  as  shown. 

For tracks crossing the boundary, a special algorithm must be
executed to determine which of the eight neighboring trackers should receive
the track and to what state vector the new tracker should be initialized. One
technique for accomplishing this is diagrammed in Figure 5.4-2 and employs
the following strategy:

1. Extend the (x, y) coordinates of the tracker beyond the boundaries
of its domain and continue to keep a track file on tracks that
cross the boundary, using extrapolation without new observations.

2. Continue to extrapolate the track until its predicted position can
place it uniquely in one of the eight adjacent trackers. For
example, referring to the figure, the track should be assigned to
neighbor number 8 if

(x > 64 + δ) and (-64 < y ≤ +64)

where

δ = gap width (in pixel units) + 1 pixel

3. Once the neighbor receiving the track has been determined,
coordinate conversion into the new tracker's (x, y) frame can be
performed.

Figure  5.4-2  shows  the  domain  of  a tracker  and  portions  of  the 
domain  of  the  eight  nearest  neighbors.  In  the  first  part  of  the  flow  chart  the 
determination  of  the  boundary  condition  is  shown.  For  tracks  that  have 
crossed  the  boundary,  a search  for  conditions  that  put  the  track  in  one  of  the 
neighbors  is  made.  If  no  such  conditions  are  found,  the  track  is  extrapolated. 
If a unique neighbor is found, the state vector of track i is transferred to
that neighbor after coordinate conversion. Track i is then deleted from the
tracker from which it was transferred.
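
The neighbor search can be sketched as below. Because the neighbor
numbering of Figure 5.4-2 is known here only for neighbor 8, the sketch
identifies a neighbor by its (column, row) offset instead of a number. The
treatment of the negative boundaries and the coordinate conversion (tracker
origins assumed to be one domain width plus one gap width apart) are
assumptions, not taken from the report.

```python
HALF_WIDTH = 64  # tracker domain spans -64 < x <= +64 in each axis

def _axis_zone(v, delta):
    """Return 0 inside the domain, +1/-1 clearly beyond it, None in the gap."""
    if -HALF_WIDTH < v <= HALF_WIDTH:
        return 0
    if v > HALF_WIDTH + delta:
        return 1
    if v <= -HALF_WIDTH - delta:   # assumed mirror of the positive-side test
        return -1
    return None

def neighbor_offset(x, y, delta):
    """(0, 0) if still inside, (ix, iy) for a unique neighbor, else None.

    None means the position is still ambiguous and the track should
    continue to be extrapolated.
    """
    ix, iy = _axis_zone(x, delta), _axis_zone(y, delta)
    if ix is None or iy is None:
        return None
    return (ix, iy)

def to_neighbor_frame(x, y, offset, gap):
    """Convert (x, y) into the receiving tracker's frame (assumed geometry)."""
    pitch = 2 * HALF_WIDTH + gap
    return x - offset[0] * pitch, y - offset[1] * pitch
```

For a gap of 2 pixels, δ = 3, so a track is handed to the neighbor on the
+x side only once x exceeds 67.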

After iterating through all i values and updating all valid tracks, the
program proceeds to install the new tracks, if any. New tracks first are
created from all those observations that did not correlate with any existing
tracks. Next, all tracks that were handed over from neighbors are installed
as new tracks. All new tracks are tentative and must pass the confirmation
criterion before becoming full-fledged tracks. Gates for tentative tracks,
particularly crossover tracks, can be larger than in the steady state.

In  the  following  block,  the  threshold  value  for  the  tracker  is  adjusted. 
Possible  criteria  could  be  total  number  of  observations,  the  number  of  active 
tracks  and/or  the  rate  of  change  of  the  number  of  active  tracks. 
Finally, self test functions can be performed in the time remaining
in the frame. The monitoring process will be interrupted by the frame clock,
which will direct the program back to START at the beginning of the next
frame.

A significant part of the tracking software has been refined to the
point that tentative instruction counts, storage requirements and execution
times can be obtained. Figures 5.4-3 thru 5.4-7 show the flowcharts for the
respective parts of the BTH tracking software. Table 5.4-1 gives a detailed
account of storage requirements. Table 5.4-2 shows the number of
instructions and the number of operations (instruction executions) necessary
to implement the flow charts shown in the figures. Also in Table 5.4-2, the
execution time for this part of the program is computed under certain
plausible assumptions and is found to be under 0.02 seconds. Considering
that the typical frame time is 0.1 second and that the 0.02 second execution
time represents the processing time for a very significant part of the total
tracking software (certainly more than 20%, and probably more than 50%), it
can be concluded that one µPT will be able to handle tracking for one MFPA
chip easily. It may even be possible to let one µPT handle several MFPA
chips.

Star  Discrimination  Algorithms 

The second algorithm used in the microprocessor-tracker eliminates
the star background. Utilization of a staring sensor for tracking space
objects against the moving background star field (Figure 5.4-8) points up
three problem areas for the signal and track processors:

1. The number of detectable stars is a function of sensor sensitivity
and can be considerable (see Table 5.4-3).

2. With the satellite in synchronous orbit, the star field appears to
be moving at a rate of 2.9 pixels/sec. This motion must be
compensated for in order to cancel star detections.

3. The signal-to-noise ratio (S/N) of a typical satellite target can
be less than unity (e.g., for RVs, S/N may be less than 0.5).
This complicates the track initiation problem unless some
time-delay and integration (TDI) is applied.


BLOCK 3 - COMPUTE GATE FILE J(r), p(r)

    For (i = 1, N); (j = 1, M)
    Do:
        Dx = X*(j) - X(i)
        Dy = Y*(j) - Y(i)
        Da = A*(j) - A(i)
        If (Dx > Cx) or (Dy > Cy) or (Da > Ca)
        Then D(i,j) = +∞ (or a very large value)
        Else D(i,j) = Dx²/[…] + Dy²/[…] + Da²/[…]

Figure 5.4-5. BTH block 3.
BLOCK 4 - TRACK UPDATE

1. COMPUTE WEIGHTED INNOVATIONS

2. TRACK UPDATE USING α-β TRACKER

    SET: ΔX(i) = 0, ΔY(i) = 0, ΔA(i) = 0

    FOR (r = 1, Mi) DO:
        ΔX(i) = ΔX(i) + p(r) * [X*(j(r)) - X(i)]
        ΔY(i) = ΔY(i) + p(r) * [Y*(j(r)) - Y(i)]
        ΔA(i) = ΔA(i) + p(r) * [A*(j(r)) - A(i)]

    COMPUTE:
        Ẋ(i) = Ẋ(i) + βx(i) * ΔX(i) / T
        X(i) = X(i) + αx(i) * ΔX(i) + T * Ẋ(i)
        Ẏ(i) = Ẏ(i) + βy(i) * ΔY(i) / T
        Y(i) = Y(i) + αy(i) * ΔY(i) + T * Ẏ(i)
        Ȧ(i) = Ȧ(i) + βA(i) * ΔA(i) / T
        A(i) = A(i) + αA(i) * ΔA(i) + T * Ȧ(i)

Figure 5.4-6. BTH block 4.
Figure 5.4-8. Star background discrimination.


TABLE 5.4-3. NOMINAL STAR DENSITY

                 Star Density (stars/deg²)
                 Out of the        In the            Stars per MFPA     Star Density
Threshold        Galactic Plane    Galactic Plane    (16,384 pixels)    in pixels

30 watts/sr      3                 •0                6                  1 in 2731
5 watts/sr       7                 2500 (?)          84                 1 in 195

Assuming that the 8 MIPS* dedicated microprocessor tracker
described in Section 4 of this report is available, the anticipated star
densities seen by an MFPA chip can be handled. In fact, 110 updated tracks
(the worst case of µ + 3σ) per frame would allow 7200 instructions per
update at a frame rate of 10 Hz. Other considerations, such as anticipated
mean star spacing and desired pixel integration times, make a frame rate
between 1 and 10 Hz desirable. Current estimates place the number of
operations required to process one update between 100 and 800. Hence 8 to
64 MFPA chips may time share a single µPT. Track parameters include, as a
minimum, two spatial dimensions, intensity and a track status flag.

*(4000 instructions/track x 200 targets/frame x 10 F/S = 8 MIPS)
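
The throughput arithmetic above can be checked with a few lines. The
function names are illustrative only. The unrounded results come out
slightly above the figures quoted in the text (about 7270 instructions per
update, and 9 to 72 chips from a 7200-instruction budget), so the report's
7200 and 8 to 64 read as rounded, conservative values.

```python
def instructions_per_update(mips, tracks_per_frame, frame_rate_hz):
    """Instruction budget available for each track update."""
    return mips * 1e6 / (tracks_per_frame * frame_rate_hz)

def chips_per_processor(budget_per_update, ops_per_update):
    """Number of MFPA chips that can time share one tracker."""
    return int(budget_per_update // ops_per_update)

budget = instructions_per_update(8, 110, 10)
print(round(budget))                      # instruction budget per update
print(chips_per_processor(7200, 800))     # heaviest update estimate
print(chips_per_processor(7200, 100))     # lightest update estimate
```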




Figure 5.4-9 shows a block diagram of this star discrimination
preprocessing function. Note that star tracks are not deleted unless several
frames (e.g., 5) pass without hits in the predicted positions. Any hit which
cannot be correlated with a star is treated both as a new track and as a
potential target. Only such hits are passed on to the system tracking
algorithms. These algorithms are designed to delete tracks moving with
the star velocity vector.

In the exceptional case that targets (ASATs, RVs, etc.) are moving
with the star field (in speed and direction) at all times, they cannot reach
the sensor, or any other target, and therefore need not be tracked. If this
rule is insufficient for discrimination purposes, changes in target
amplitude could be used to augment the basic velocity discrimination
algorithm.

The  standard  method  for  track  initiation  is  to  attempt  to  correlate 
consecutive  HITS.  This  is  accomplished  by  placing  a gate  around  a new 
detection  and  checking  the  next  frame  for  a HIT  within  the  gate.  For  low  S/N 
applications,  it  may  be  necessary  to  use  the  temporal  discrimination  filter 
discussed  in  5.3.  If  a target  were  moving  at  an  unknown  velocity  in  any 
random  direction,  TDI  requires  some  a priori  tracking  information.  Thus, 
the  designer  faces  a dilemma:  TDI  cannot  be  used  until  a track  has  been 
initiated,  and  track  initiation  itself  relies  on  some  sort  of  TDI  due  to  low 
S/N.  Several  approaches  should  be  considered  during  system  design  in  order 
to  resolve  this  dilemma. 

ATH  Software  Implementation 

The  ATH  tracking  algorithm  operates  in  two  modes: 

1.  Target  acquisition  mode,  in  which  all  threshold  excessions  are 
considered  and  stars  are  eliminated,  and 

2. Track maintenance mode, in which only tracks designated as targets
are tracked.

The track maintenance procedure is executed every frame time. The target
acquisition procedure is executed periodically. Currently, it is
contemplated to execute target acquisition once every 3 seconds. Figures
5.4-10 and 5.4-11 show top level functional flow diagrams of the programs
for target acquisition and for track maintenance, respectively.


Figure 5.4-9. Star preprocessing.

Figure 5.4-10. Target acquisition flow diagram for ATH mode.

Figure 5.4-11. Track maintenance flow diagram for ATH mode.


The target acquisition procedure starts out by computing the "motion
vector". This vector represents the shift of the frame of reference since
target acquisition was last executed.

Next the list of stars compiled during the last target acquisition is
scanned. Each star's position is modified by the motion vector and the
current observation file is searched for a corresponding entry. If no such
entry is found, two conditions have to be checked for:

1. The star may not have been observed in many consecutive frames, in
which case it is deleted from the star list, or

2. The star may have left the field of view, in which case it is also
deleted.


If neither condition exists (but the star was not found in the
observation file) the star-list entry is updated by extrapolation and the
miss-counter (a field of the star-list entry) is incremented.

Usually, however, the star will be found in the observation file. In
that case the respective observation is marked as having been processed.
The entry in the star list is updated based on the observation and the
miss-counter is reset.
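
A sketch of the per-star processing just described follows. The dictionary
layout, the square matching gate, and the default values of `gate`,
`max_misses` and `fov_half_width` are hypothetical; the report says only
"many consecutive frames" and does not specify the matching rule.

```python
def update_star(star, observations, motion, gate=1.0, max_misses=5,
                fov_half_width=64):
    """Process one star-list entry; return False if it should be deleted.

    star         : dict with keys "x", "y", "misses"
    observations : list of dicts with keys "x", "y", "processed"
    motion       : (dx, dy) motion vector since the last acquisition
    """
    px, py = star["x"] + motion[0], star["y"] + motion[1]
    match = next((o for o in observations
                  if not o["processed"]
                  and abs(o["x"] - px) <= gate
                  and abs(o["y"] - py) <= gate), None)
    if match is not None:
        match["processed"] = True          # mark the observation as used
        star.update(x=match["x"], y=match["y"], misses=0)
        return True
    # Not found: delete if missed too often or outside the field of view.
    if (star["misses"] >= max_misses
            or abs(px) > fov_half_width or abs(py) > fov_half_width):
        return False
    star.update(x=px, y=py, misses=star["misses"] + 1)  # extrapolate
    return True
```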

Last, it has to be checked whether the star being processed will
cross into a neighboring chip before the subsequent target acquisition
period. If that is the case, the respective neighboring track processor is
notified. The above procedure is repeated for every star in the star list.

Next the current list of targets is scanned and the observation file is
searched for entries corresponding to each target. If a corresponding
observation is found, it is marked as having been processed. If the target's
motion was identical to the reference star's motion, then the entry is
deleted from the target list and added to the star list.

If no observation corresponding to a given target is found, then the
miss-counter of that target (a field in each target list entry) has to be
checked. Targets that have not been observed in many consecutive frames are
deleted. The above procedure is repeated for every entry in the target list.

Next the observation file is scanned and every unmarked (i.e., unprocessed)
observation is added to the target list. Thereafter all crossover tracks
from neighboring track processors are added to the star list or target list,
as the case may be.




The  threshold  may  be  adjusted  to  maintain  the  processing  load  of  the 
track  processor  constant.  (The  maximum  capability  being  10  threshold 
excessions  per  second.) 

Last, the brightest star from the central area of the detector/mux
chip is added to the target list to serve as reference star until the next
target acquisition period.

The track maintenance procedure is executed every frame time. No
additions or deletions to the target list are made; only continuation points
of existing tracks are determined here.

A gate is computed for each track. Typically exactly one observation
will fall within the gate. If no observation is found within the gate, then
the respective target-list entry is updated as if an observation had been
found in the center of the gate, and the miss-counter is incremented. If one
or more observations fall within the gate, their centroid is computed and
the respective target-list entry is updated based on this centroid. The
miss-counter is reset.

This  procedure  is  repeated  for  each  entry  in  the  target  list. 
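
The track maintenance step for one target-list entry might look like the
sketch below. The square gate and the dictionary layout are assumptions;
the function returns the position that would be fed to the track update.

```python
def maintain_track(track, observations, gate_half_width):
    """Return the measurement used to update one target-list entry.

    track        : dict with "x", "y" (gate center) and "misses"
    observations : list of (x, y) tuples for the current frame
    """
    inside = [(ox, oy) for ox, oy in observations
              if abs(ox - track["x"]) <= gate_half_width
              and abs(oy - track["y"]) <= gate_half_width]
    if not inside:
        # No hit: behave as if an observation sat at the gate center.
        track["misses"] += 1
        return track["x"], track["y"]
    # One or more hits: use their centroid and reset the miss-counter.
    cx = sum(o[0] for o in inside) / len(inside)
    cy = sum(o[1] for o in inside) / len(inside)
    track["misses"] = 0
    return cx, cy
```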

The execution times for the target acquisition procedure and for the
track maintenance procedure are estimated to be less than 0.003 and 0.018
seconds respectively. This indicates that up to 32 detector/mux chips can
be handled by one track processor.

In summary, star discrimination at a level of 64 stars per detector/
mux chip can be handled uniquely with the dedicated on-board trackers by
using a simple prediction algorithm which eliminates all targets that move
with the star field and do not change in amplitude over time.


Omni-directional  TDI  (OTDI) 

Here a nominal target speed is assumed and integrations proceed in
"all" directions (say eight). After a definite integration time all results
are checked and the largest value selected (see Figure 5.4-12).
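
As a rough illustration of the idea, the sketch below sums pixel values
along eight assumed directions from a starting pixel and keeps the largest
sum. The frame layout (`frames[k][y][x]`), the integer `step` standing in
for the nominal target speed in pixels per frame, and the function name are
all assumptions made for this example.

```python
# Eight integration directions as (dx, dy) steps per frame.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]

def otdi(frames, x0, y0, step=1):
    """Integrate along each direction; return (largest sum, direction)."""
    best = None
    for dx, dy in DIRECTIONS:
        total = 0.0
        for k, frame in enumerate(frames):
            x, y = x0 + k * step * dx, y0 + k * step * dy
            # Samples that fall off the array contribute nothing.
            if 0 <= y < len(frame) and 0 <= x < len(frame[0]):
                total += frame[y][x]
        if best is None or total > best[0]:
            best = (total, (dx, dy))
    return best
```

The cost grows with the number of directions and frames, which is the
computing burden the following paragraph refers to.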

This technique consumes a great deal of computing power in the
trackers and generally represents "overkill." A hand-over of RV track
information from the BTH sensors to the ATH sensors includes an estimated
velocity of 6 pixels/sec and a direction of about ±10 degree uncertainty. As
illustrated in Figure 5.4-13, OTDI then needs to be performed only over a
limited 30 degree sector and velocity range of ±1 pixel/sec.

Figure 5.4-12. Omni-directional time delay and integration. (Figure
labels: track gate; direction of integration; pixels.)

Two-dimensional Transform Techniques*

A patch around the new potential detection is integrated in time. It is
subsequently spatially transformed (WHT or FFT) and the two-dimensional
spatial frequency diagram (k-space) examined. In the integrated input patch,
tracks should appear as weak line segments which are detectable in the
transform domain as resonances at certain points in k-space.

It is recognized that both of these approaches require further study,
both with respect to their scopes of applicability and their data processing
requirements. A tradeoff must be made between the FFT and WHT transform
techniques and others.

In summary, star discrimination at a level of 2500 stars/deg² (84
stars per MFPA chip) can be handled uniquely with the dedicated on-board
trackers (µPT) by using a simple prediction algorithm which eliminates all
targets that move with the star field and do not change in amplitude over
time. In addition, for certain targets with low signal-to-noise ratios, an
OTDI technique and a transform technique have been introduced as solutions
to enhance the detection probability of weak targets (RVs). These algorithms
will be implemented in the trackers. Finally, the trackers can be instructed
to track designated targets which are handed over from the BTH subsystem.
This a priori information to the trackers relaxes the computation load on
the µPT.

*W. K. Pratt, Image Processing Institute, Summer Course, University
of Southern California, Los Angeles, California, 1975.

Figure 5.4-13. Selective direction time delay and integration.
This section contains an analysis of the adaptive video encoder
performance, the temporal detection filter signal-to-noise and
signal-to-clutter performance, the Walsh Hadamard processor and the tracker
performance. It is shown that a relatively simple α-β tracker is effective
in tracking maneuvering targets with a fraction of a pixel error and in
deleting false tracks after a few frame times.

6.1 Adaptive Signal Encoder Performance Analysis

This  subsection  consists  of  the  following  items: 

(1)  Simulation  Results 

(2)  Encoder  Noise  Analysis. 


Simulation  Description 

The simulation approach taken in this analysis utilized a two-phase
investigation. In the first phase the predictive feedback encoder of
Figure 4.1-2 was modeled assuming no D/A or summing (E = S - P) errors.
Additionally, a single A/D with 5-bit precision and infinite word length was
modeled, instead of a two channel A/D. This yielded an idea of the maximum
magnitude of the difference signal E and its dependence upon the input
waveform and sample rate. Knowing this, it was possible to determine
under what conditions the magnitude of E would exceed a 5-bit word size,
i.e., saturation. Because saturating values of E result in large encoding
errors, the dual channel A/D shown in the block diagram of Figure 4.1-2
was chosen to reduce such errors for large difference signals while
retaining the 5-bit accuracy for small difference signals.

Except for the fact that quantization error was included in the model,
the first phase might be considered to be the "ideal" case. The second phase
modeled the encoder as shown in Figure 4.1-2, considering saturation
effects and A/D quantization error but again neglecting D/A and summing
noise. The input-output error and transient characteristics were then
determined for specified input signals.


Simulation  Results 

Two simulations were performed. The first modeled the encoder
neglecting A/D saturation and overflow effects in order to look at the
difference signal magnitude E and see under what conditions it would
saturate a conventional 5-bit A/D converter. It was found that the maximum E
generated was a function of both the input signal and the number of samples
per dwell time, denoted by $\tau_d/T_f$. For a sampled, convolved-Gaussian,
unity maximum amplitude signal input ($\sigma = 0.1283d$)* the A/D would
saturate at values of $\tau_d/T_f$ below 150 for n = 1 and below 35 for
n = 2. Thus the channel 'b' A/D converter is sufficient to encode the error
signals for higher $\tau_d/T_f$ values with 5-bit accuracy and resolution,
while for all lower values of $\tau_d/T_f$ the channel 'a' A/D must be used,
with a resultant error no greater than 5 LSB.

The second simulation determined the following characteristics:

1. RMS input-output error vs $\tau_d/T_f$ and error in encoding signal
peak vs $\tau_d/T_f$.

2. Transient response.

The rms and peak errors are defined as follows:

\[ \epsilon_{rms} = \left[ \frac{1}{K} \sum_{k=1}^{K} (S_k - PC_k)^2 \right]^{1/2} \]

where

$S_k$ = MFPA signal at time k

$PC_k$ = encoder output signal at time k

$K$ = number of samples over signal duration
    = $2\tau_d/T_f + 1$ (see figure above)

and

\[ \epsilon_{peak} = \left| S_k - PC_k \right|, \qquad k = \frac{\tau_d}{T_f} + 1 \]

*In this expression d is the detector width and σ is a measure of the
blur width based on the 60% point of the Gaussian waveform.

Figure 6.1-1 shows $\epsilon_{rms}$ as a function of $\tau_d/T_f$. The
non-linear nature of the curves is a result of the non-linearities
introduced by the two-channel A/D converter and the prediction clamping
(which clamps the prediction to be between 0 and 1).

At  one  sample/dwell  the  difference  signal  is  converted  by  the  coarse 
A/D  because  of  its  large  magnitude;  this  results  in  large  errors  due  to  the 
type  of  conversion. 

At two samples/dwell the difference signal is in general large but its
conversion does not yield the large quantization error as before. The
input-output error is mainly a function of the A/D conversion error, which
is a non-linear function of the difference signal amplitude. Conversion
error is determined by the magnitude of the difference signal E: if E is
larger than 5 bits, it is encoded by the coarse A/D with a 5-LSB error; an E
of 5 bits or less is encoded by the fine A/D with 5-bit precision. The error
characteristics of the encoder thus depend on the frequency of occurrence of
large difference signals E, which depends in turn upon the ability of the
predictor to track the input waveform. Thus the behavior of the encoder at
$\tau_d/T_f = 2$ is a result of the fact that the difference signal
magnitude was such that smaller conversion errors were introduced.

At large values of $\tau_d/T_f$ the curve for n = 2 levels off at the
quantization noise of the low-channel A/D converter. This is due to the
decreasing magnitude of the difference signals for large $\tau_d/T_f$; such
smaller signals can be encoded by the A/D with less quantization error.
Figure 6.1-1. Rms error.

Except for one sample/dwell, the input-output error remains well
below 1 percent.

Figure 6.1-2 is a plot of $\epsilon_{peak}$ as a function of $\tau_d/T_f$.
The large error at one sample/dwell time is due to the error in converting
the large difference signal with the high-channel A/D. As more samples are
taken, the predictions become better in the sense that the smaller
difference signals can be encoded by the low-channel A/D with less resulting
error. At values of $\tau_d/T_f$ above 10, the quantization noise of this
low-channel A/D dominates and the curves level off.

Again, except for $\tau_d/T_f = 1$, the error is well below 1 percent.

Figures 6.1-3 and 6.1-4 show the encoder response to a step function
of varying amplitude $A_0$. Except for the case $A_0 = 0.1$ the curves are the
(Figure labels: encoder output amplitude; A = 0.23; input amplitude.)


TABLE 6.1-1. ENCODER PERFORMANCE SUMMARY

6.1.2  Noise  Analysis 

To gain insight into the noise characteristics of the encoder consider
the following development. The MFPA output is

\[ Q(k) = S(k) + N(k) \]

where S(k) is the signal and N(k) is noise of arbitrary distribution. From
this is subtracted the prediction P(k) given by

\[ P(k) = P_Q(k) + \epsilon_{D/A}(k) \]

where $P_Q$ is the value of the predicted word and $\epsilon_{D/A}$ is the
noise introduced by the D/A converter. The result of the subtraction is the
difference signal

\[ E(k) = Q(k) - P(k) + \epsilon_{sum}(k)
        = Q(k) - P_Q(k) - \epsilon_{D/A}(k) + \epsilon_{sum}(k) \]

where $\epsilon_{sum}$ is the noise due to the analog summer. Now E(k) is
converted to a value

\[ E_Q(k) = E(k) + \epsilon_{A/D}(k) \]

where $\epsilon_{A/D}$ is the noise introduced by the A/D conversion
process; $E_Q$ may now be rewritten

\[ E_Q(k) = Q(k) - P_Q(k) - \epsilon_{D/A}(k) + \epsilon_{sum}(k) + \epsilon_{A/D}(k) \]

and if the encoder noise is defined by

\[ \epsilon_{enc}(k) = \epsilon_{A/D}(k) + \epsilon_{sum}(k) - \epsilon_{D/A}(k) \]

then

\[ E_Q(k) = Q(k) - P_Q(k) + \epsilon_{enc}(k) \]

and the predicted corrected value is

\[ PC_Q(k) = E_Q(k) + P_Q(k)
           = Q(k) + \epsilon_{enc}(k) = S(k) + N(k) + \epsilon_{enc}(k) \]

So, the output value is the same as the input value except for the encoder
noise introduced. The encoder has done nothing to affect the input noise
N(k). If the input noise is assumed to be much larger than the noise
introduced by the A/D and D/A converters (except when the 2-channel A/D is
operating in channel 'a') then the signal-to-noise ratio at the encoder
input will be degraded by any non-negligible noise introduced by the analog
summation and/or channel 'a' errors.

6.2 Temporal Filter Performance

This  subsection  presents  typical  results  on  the  effectiveness  of  the 
temporal  detection  filter.  In  particular,  the  signal  to  clutter  ratio  before 
and  after  filtering  is  shown  as  a function  of  the  number  of  samples  per  dwell 
time  and  the  noise  equivalent  bandwidth  for  the  temporal  filter  is  calculated 
for  the  system  level  performance  analysis  purposes. 

The electrical signal at the detector output is filtered to eliminate
the low frequency scene background. The purpose of this filter is to reduce
the dynamic range requirement of the CCD registers. The transfer function
for this circuit is

\[ H_B(\omega) = \frac{j\omega}{\omega_B + j\omega} \]

where $\omega_B$ is placed at a value low enough not to reduce the target
signals of interest. A reasonable value of $\omega_B / 2\pi$ = 0.1 Hz was
chosen for performance analysis. This is based on current technology
capability and the requirement of passing all target signal frequencies of
interest.

The current from this AC coupling filter is integrated and held for
T seconds in the CCD storage bucket. The resulting charge samples are
multiplexed out to the adaptive video encoder and converted to digital words.
The transfer function for the sample and hold circuit is

\[ H_T(\omega) = T \left[ \frac{\sin(\omega T/2)}{\omega T/2} \right] \]

This filter serves as a band limiter which prevents noise folding into the
frequency domain occupied by target signals.

A Temporal Discrimination Filter provides each pixel with a dedicated
filter which provides background clutter rejection. The discrimination is
based on the fact that the targets are moving relative to a stationary
or slowly changing clutter scene. Consequently the filter must emphasize
the frequencies that contain target energy and suppress those that contain
clutter, i.e., the low frequency response must be severely attenuated.

For  the  simulation  case,  it  has  been  assumed  that  a radiance  step 
moves  across  the  detector  aperture.  After  filtering  by  the  optics  and  the 
detector  convolution,  the  leading  edge  of  this  signal  closely  approximates 
a ramp.  The  slope  of  the  ramp  (in  time)  is  proportional  to  the  velocity  of 
the  clutter  edge.  The  lowest  order  digital  temporal  filter  that  will  give  zero 
response  to  a ramp  is  a second  differencing  filter.  It  will  have  a response 
only  at  the  corners  of  the  ramp.  A third  difference  digital  filter  provides 
zero  response  to  a parabolic  input  and  consequently  effectively  rejects  the 
ramp  that  is  smoothed  by  the  impulse  response  of  the  telescope. 

The  transfer  function  for  the  third-order  difference  filter  is 


$$H_d(z) = (1 - z^{-1})^3 = 1 - 3z^{-1} + 3z^{-2} - z^{-3}$$

in the Z transform domain. The frequency response is given by

$$\left|H_d(e^{j\omega T})\right| = 2^3 \left[\sin(\omega T/2)\right]^3$$

A transversal filter implementation is shown in Figure 3.4-2. A target
pulse response is shown in Figure 3.4-4.
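The ramp and parabola rejection properties described above can be checked numerically. The following is a minimal sketch, not part of the original simulation: the function names and the test signals are illustrative assumptions, and the filter is applied by repeated first differencing, which is equivalent to the transversal form for valid output samples.

```python
import cmath
import math


def difference_filter(samples, order):
    """Apply (1 - z^-1)**order by repeated first differencing.

    Only fully valid outputs are returned, so the result is `order`
    samples shorter than the input.
    """
    out = list(samples)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def third_difference_gain(omega_t):
    """Magnitude of (1 - exp(-j*omega*T))**3 at the angle omega*T (radians)."""
    return abs((1 - cmath.exp(-1j * omega_t)) ** 3)


ramp = [0.5 * n for n in range(12)]           # constant-slope clutter edge
parabola = [0.25 * n * n for n in range(12)]  # smoothed (quadratic) edge

second_on_ramp = difference_filter(ramp, 2)        # zero away from the corners
third_on_parabola = difference_filter(parabola, 3)  # zero for a quadratic input
closed_form = 8 * math.sin(1.0 / 2) ** 3            # 2^3 [sin(wT/2)]^3 at wT = 1 rad
```

The second difference of the parabola is a nonzero constant, which is why the third difference is needed once the ramp has been smoothed into a curved edge.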

The system frequency response is shown in Figure 6.2-1. The system
response was obtained for several different frame integration periods
each corresponding to the different target velocity windows. In the
implementation of the TDF an N-frame accumulator provides this function. The
value of N is chosen in accordance with the target velocity. The noise
equivalent bandwidth corresponding to the appropriate filter is also shown in
Figure 6.2-1.

The curves in Figure 6.2-2 indicate the clutter rejection capability
of the third difference TDF. The ordinate is the output to input amplitude
ratio. Both the target and the clutter edge have the same signal power within
the detector area. The amplitudes have been calculated for a 0.1 second
integration time. They are plotted against the dimensionless constant

$T/\tau_d$ = (Number of samples per dwell time)

where

T = integration (frame) time, sec
$\tau_d$ = target or clutter dwell time, sec

From this figure it is apparent that an optimum frame time to dwell
time ratio for targets is about 0.5 whereas for clutter, the optimum ratio is
zero. Hence optimizing the signal response subject to the constraints of
maximum signal-to-clutter ratio yields a value of target frame time to dwell
time of 0.2 to 0.3.

6.3 Tracking Performance

The purpose of the track processor is to correlate target observations
into multiple tracks. The principal algorithms for doing this and their
functions are briefly described below.

Association and correlation refer to the techniques by which
received observations are assigned to existing tracks.

Track  initiation  is  the  process  of  using  new  observations  that  are 
not  associated  with  existing  tracks  to  form  new  tracks. 

Track deletion algorithms are designed to remove low-quality
tracks from future consideration.

Track  update  and  prediction  algorithms  are  used  to  incorporate  new 
correlating  observations  into  the  existing  tracks  and  form  new  state 
variable  estimates. 

Gates  are  formed  around  the  tracks'  predicted  positions  and  are  used 
to  limit  the  number  of  observations  considered  for  potential  track 
update. 

Measurement  Error  in  Time  Centroiding 

Time  centroiding  determines  the  position  of  a target  image  along  its 
direction  of  motion.  A common  method  for  motion-direction  centroiding  is 
the  determination  of  the  time  of  the  peak  signal  at  the  output  of  the  pixel  as 
shown in Figure 6.2-3. Notice that a threshold is shown at about half the
level  of  the  expected  peak  signal.  This  eliminates  most  of  the  ambiguities 
caused  by  the  presence  of  the  target  on  more  than  one  pixel. 

Errors  are  introduced  by  the  presence  of  noise,  which  distorts  the 
signal  enough  to  give  a false  indication  of  the  peak.  Furthermore,  the  finite 
sampling  time  is  a cause  of  error,  since  the  time  of  the  maximum  sample 
is  uniformly  distributed  over  the  sampling  interval  with  respect  to  the  peak. 
Finally, the relationship of the size of the point-spread function with respect
to  the  size  of  the  pixel  affects  the  shape  of  the  output  signal. 

An upper bound on the centroiding error can be found from the case
of no centroiding: when a detection is made on a given pixel, the target image
is assumed to be in the center of its path across the pixel. The location of
the peak is then a uniformly distributed random variable along the path. For
the path in Case A of Figure 6.2-3, this yields $\sigma_c^2 = D^2/12$, where D is the
pixel dimension. For the path in Case B, $\sigma_c^2 = (\sqrt{2} D)^2/12$. Normalizing
these errors to the pixel dimension yields $\sigma_c$ (Case A) = 0.289 and $\sigma_c$
(Case B) = 0.408.
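The normalized bounds follow directly from the variance of a uniform distribution, L^2/12 for a path of length L. A minimal sketch of the arithmetic, assuming path lengths of D for Case A and sqrt(2) D for Case B as stated above (the function name is illustrative):

```python
import math


def uniform_sigma(path_length):
    """Standard deviation of a uniform distribution over `path_length`."""
    return path_length / math.sqrt(12.0)


D = 1.0  # pixel dimension, used as the unit of length
sigma_case_a = uniform_sigma(D)                 # straight path across the pixel
sigma_case_b = uniform_sigma(math.sqrt(2) * D)  # diagonal path across the pixel
```

With D = 1 these evaluate to roughly 0.289 and 0.408 pixel.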
Following References 1 and 2 and using a maximum likelihood
approach, a score function ($L_K$) may be defined for the incorporation of K
frames of data into $n_K$ tracks

where


M = measurement dimensionality,

$D_i$ = track (i) length,

$P_L(D_i)$ = probability of track length $D_i$,

B' (n) = true (false) target density,

$P_D$ = detection probability,

$m_i$ = total number of target detections including the initial
observation,

$m_i' = m_i - 1$,

$n_K$ = number of tracks formed based upon data received
through frame K,

$\tilde{Y}_{i\ell}$ = residual error for the $\ell$th update of the ith track,

$V_{i\ell}$ = residual error covariance matrix.


Techniques for using Eq. (6.2-1) to determine track initiation and
deletion criteria are discussed in Reference 2. A desirable property of
these techniques is that any combination of observations into tracks must
lead to a score function greater than zero. As the results given in
Reference 2 indicate, this property is particularly convenient for use in defining
initiation and deletion criteria so that false tracks are quickly terminated.
Also, initiation and deletion criteria can be made adaptive in an optimal
manner to the environment.

Ideally,  true  target  density,  probability  of  detection,  and  track  length 
statistics  as  well  as  residual  error  and  false  target  density  should  be  used. 
However,  these  parameters  are  not  presently  defined.  Thus,  to  derive 
preliminary  results,  tentative  algorithms  are  considered  whereby  initiation 
is  based  upon  receiving  four  correlating  observations  within  five  consecutive 
frames  and  deletion  occurs  on  six  consecutive  frames  without  a correlating 
observation. 
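The tentative initiation and deletion rules can be sketched as a small state machine. This is an illustrative sketch only: the class and attribute names are assumptions, and the text does not say whether the deletion rule also applies to tracks that are not yet confirmed; here it is applied to all tracks.

```python
from collections import deque


class TrackStatus:
    """Tentative 4-of-5 initiation and 6-miss deletion logic (illustrative)."""

    def __init__(self, hits_needed=4, window=5, misses_to_delete=6):
        self.hits_needed = hits_needed
        self.misses_to_delete = misses_to_delete
        self.recent = deque(maxlen=window)  # detections in the last `window` frames
        self.consecutive_misses = 0
        self.confirmed = False
        self.deleted = False

    def update(self, detected):
        """Process one frame; `detected` is True for a correlating observation."""
        if self.deleted:
            return
        self.recent.append(bool(detected))
        self.consecutive_misses = 0 if detected else self.consecutive_misses + 1
        if not self.confirmed and sum(self.recent) >= self.hits_needed:
            self.confirmed = True
        if self.consecutive_misses >= self.misses_to_delete:
            self.deleted = True


track = TrackStatus()
for frame_hit in [True, True, False, True, True]:
    track.update(frame_hit)
confirmed_after_five = track.confirmed  # four hits within five frames
```

A sliding window is used so that any five consecutive frames containing four hits confirm the track, not only the first five.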

Figure 6.2-4 illustrates the manner in which track initiation and
maintenance vary with the detection probability. The probability of having a
confirmed track is presented as a function of frame number. Case 1 shows
results for the situation where

$$P_D(\text{case 1}) = \begin{cases} 0.4, & \text{initial detection,} \\ 0.9, & \text{during initiation,} \\ 0.95, & \text{after confirmation.} \end{cases}$$

Case 2 has a cyclic probability of detection with a period of 33 frames

Figure 6.2-4. Comparative track initiation and maintenance.

For track filtering and prediction of both aircraft and missile targets
the classical (Reference 3) α-β tracker is proposed. It is defined by the
following equations

$$X_s(n) = X_p(n) + \alpha(n)\,\Delta X(n),$$

$$\dot{X}_s(n) = \dot{X}_s(n-1) + \frac{\beta(n)}{T}\,\Delta X(n), \qquad (6.2\text{-}2)$$

$$X_p(n+1) = X_s(n) + T\,\dot{X}_s(n),$$

where, from Reference 3, s = smooth, p = predicted and o = observed,
$\alpha$, $\beta = \alpha^2/(2-\alpha)$ = coefficients
X = state vector
$\Delta X(n) = X_o(n) - X_p(n)$ = difference between observation and prediction
T = sampling interval, frame time, sec
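The three tracker equations translate into a short loop. The sketch below is a scalar illustration under stated assumptions: the gains are held constant, beta defaults to alpha^2/(2 - alpha), and the start-up values (first observation as the initial prediction, zero initial rate) are choices made here, not taken from the report.

```python
def alpha_beta_track(observations, T, alpha, beta=None):
    """Run an alpha-beta tracker over scalar position observations.

    Returns a list of (smoothed position, smoothed rate, next prediction).
    """
    if beta is None:
        # Assumed coupling between the two gains.
        beta = alpha ** 2 / (2.0 - alpha)
    x_pred = observations[0]  # assumed start: first observation
    rate = 0.0                # assumed start: zero rate
    results = []
    for x_obs in observations:
        dx = x_obs - x_pred             # observation minus prediction
        x_smooth = x_pred + alpha * dx  # smoothed position
        rate = rate + (beta / T) * dx   # smoothed rate
        x_pred = x_smooth + T * rate    # one-step prediction
        results.append((x_smooth, rate, x_pred))
    return results


T = 0.1
truth = [2.0 + 3.0 * T * n for n in range(200)]  # noise-free constant velocity
track = alpha_beta_track(truth, T, alpha=0.5)
final_smooth, final_rate, final_pred = track[-1]
```

For a noise-free constant velocity input the rate estimate settles on the true rate and the prediction error decays to zero; a constant acceleration input leaves a steady bias.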

Aircraft  Position  Tracking  Results 

A covariance analysis applicable to maneuvering aircraft has been
performed to determine prediction error standard deviation as a function of
the sampling interval, the detection probability and the prediction (or
extrapolation) time. The results, given in Figure 6.2-5, were not found to be a
sensitive function of α but are given for the α that minimizes the prediction
error standard deviation.

The assumed white measurement noise standard deviation (0.29 pixel)
corresponds to a uniformly distributed quantization error. Target maneuver
characteristics are defined to be first order Markov with standard deviation
0.25g and time constant of 3.0 sec. Results are given for one step
prediction and extrapolation periods of 7.5 and 15 sec. These periods correspond
to extrapolation across gaps of 5 to 10 pixels for typical aircraft targets.

Missile  Tracking 

The proposed missile sampling interval is 0.1 sec. The probability
of detection is expected to be high for missiles during most of their
trajectory. Referring to Figure 6.2-5, tracking performance is generally
insensitive to detection probability when the probability is 0.7 or greater and when
the sampling interval is small (T = 0.25 sec). Thus the results presented
below will assume unity probability of detection.

Typical missile position time histories appear to be approximately
characterized by a constant acceleration. Given a constant acceleration a, the
bias error ($\epsilon_{X_p}$) for an α-β tracker is given by

$$\epsilon_{X_p} = \frac{4 - 2\alpha - \alpha^2}{2\alpha^2}\, a T^2 \qquad (6.2\text{-}3)$$

Figure 6.2-5. Maneuvering aircraft tracking error
standard deviation.


Typical missile accelerations appear to be about 0.03 pixel/sec²
or less. Thus, using Eq. (6.2-3), for α values of 0.1 or greater and a
sampling interval of 0.1 sec, the bias error is negligible.
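As a rough numerical check, the sketch below evaluates the bias expression for the values quoted above. It assumes Eq. (6.2-3) has the form (4 - 2α - α²)/(2α²) · aT², with a the target acceleration; that reading of the equation, and the function and argument names, are assumptions made for illustration.

```python
def alpha_beta_bias(alpha, acceleration, T):
    """Steady-state bias under constant acceleration, assumed form of Eq. (6.2-3)."""
    gain_term = (4.0 - 2.0 * alpha - alpha ** 2) / (2.0 * alpha ** 2)
    return gain_term * acceleration * T ** 2


# alpha = 0.1, acceleration = 0.03 pixel/sec^2, T = 0.1 sec
bias_pixels = alpha_beta_bias(alpha=0.1, acceleration=0.03, T=0.1)
```

Under these assumptions the bias is about 0.057 pixel, and it shrinks further for larger α.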

Figure 6.2-6 shows a typical transient error response using an α
of 0.1 when tracking is begun at 20 sec. These results include the effect of
the quantization measurement error for a particular case. The tracking
error shown in Figure 6.2-6 does not quite reach steady state. In steady
state, the prediction error was found to oscillate between 0.04 and 0.14
in agreement with Eq. (6.2-3). The initial error can be reduced by
developing a special initiation technique.


Figure  6.2-6.  Transient  missile  tracking  error. 

Intensity is considered to be an important discriminant for missile
tracking and identification. Preliminary results indicate that an α-β tracker
will also suffice for intensity tracking. Figure 6.2-7 shows the mean
intensity estimation error for values of α = 0.1 and 0.2 for a typical case. The
estimation error is given as a percentage of the true intensity.

Multiple  Target  Effects 

Classically, as discussed in Reference 2, multiple target interactions
are handled by using gates around each target's predicted position as a
preliminary screening device and a correlation matrix for resolving complex
conflict situations. This standard technique allows at most one observation
to be assigned to a track. However, other techniques, discussed in
References 4 and 5, propose the use of more than one observation for track update.
The choice of technique will be studied further when the tracking
environment is defined.

Figure 6.2-7. Intensity tracking error.


Crossing  Tracks 

As target tracks come together it becomes more difficult to correctly
assign observations to the tracks. In the limit where both targets are within
the same pixel there is no unique measurement. Eventually, some type of
measurement centroiding logic may be developed using the methods discussed
in References 4 and 5. Using this logic, an observation may be used for update
by both tracks.

One technique for handling a track cross is to predict its occurrence
and then to extrapolate the tracks ahead until they again become
distinguishable. Then, regular tracking would be reestablished. Results derived using
this technique should give a worst case bound on performance because the
observations received during the extrapolation period are not used. We
assume that extrapolation begins and ends when at least one pixel separation
(in either dimension) is assured with probability 0.977 (two standard
deviation case).
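The 0.977 figure matches the one-sided probability that a Gaussian variable lies below two standard deviations above its mean. A minimal check, assuming Gaussian position errors (an assumption not stated explicitly in the text):

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


two_sigma_probability = normal_cdf(2.0)
```

This evaluates to approximately 0.977.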


Issues  for  Further  Consideration 


As  discussed  above  it  is  necessary  that  the  parameters  of  the  tracking 
environment  be  specified  before  efficient  tracking  algorithms  and  realistic 
performance  estimates  can  be  defined.  This  is  particularly  true  of  the 
target  detection  probability  and  false  target  density  and  distribution. 

Further study should be performed on the tradeoffs between
computational complexity and tracking performance. For example, in the
filtering area, the increased complexity of an α-β-γ tracker would allow the
tracking of an acceleration with no steady state bias error. However,
preliminary results indicate that this may not be necessary. Also, as the
detection probability becomes known the use of a Kalman filter may be
considered.

Several other techniques that would provide tracking improvements
but also require additional computations are being considered. First,
following References 1 and 2, a centroided measurement technique is being
considered. In the presence of a high false target density and a moderate to
large sampling interval (T > 0.5 sec) preliminary results indicate that this
technique should reduce the tracking error and the probability of track
divergence. However, this method is computationally complex.

Finally, track splicing and handover to other sensors are important
issues in the system context.
1.0 INTRODUCTION


This report presents the results of the Device Technology Survey
task  conducted  for  the  Adaptive  Programmable  Signal  Processor  (APSP) 
program.  The  purpose  of  this  task  was  to  determine  the  availability  (both 
present  and  projected)  of  semiconductor  devices  applicable  to  the  APSP 
design  effort. 

The  task  is  broadly  divided  into  three  portions: 

1. A survey of both present and projected availabilities of
semiconductor devices applicable to design of the Layered Array
Processor (LAP) in the APSP. The survey treats both
technologies (e.g., I²L, CMOS, etc.) and devices (e.g.,
microprocessors, memories, etc.).

2.  A brief  discussion  of  the  Adaptive  Video  Encoder  (AVE)  of 
the  APSP  in  terms  of  its  function  and  the  associated  critical 
devices  required. 

3. The fabrication and testing at Hughes of a variety of
Hughes-designed microelectronic devices, including CCD compatible
bipolar devices, MOS, integrated injection logic (I²L) and
charge coupled devices (CCD's).

Examination of software applicable to the APSP design has been
deferred to a more appropriate forthcoming report, E-O Processor Definition.
Another forthcoming report, Critical Device Design, will, with this study as
a basis, examine those devices and processes that are critical to the APSP
design, as defined in the currently ongoing E-O Processor Definition Task;
it will also provide preliminary designs, evaluate associated technical risks
and supply appropriate development schedules.
2.0 DIGITAL TECHNOLOGY STATUS


This section provides a survey of the present status and probable
future of those digital technologies applicable to the Adaptive Programmable
Signal Processor (APSP). In section 2.1, a brief review of the fundamental
limitations on digital devices is presented, and is applicable to both the
basic logic gates which make up memory and logic functions, and the
microprocessors and associated digital systems utilizing those gates. The
development of commercial microprocessors is reviewed in section 2.2, and
anticipated future performance predictions are developed.


Section 2.3 reviews the field of high speed low power memory, as
required  for  the  APSP  concept,  and  projects  its  future  trends. 

The computational capabilities, size, power and cost of the APSP
depend upon the characteristics of the digital technology available in the early
1980s. Section 2.4 covers the generic field of logic devices suitable for Large
Scale Integration (LSI), and eliminates all except the principal contenders for
low cost, low power-delay product and high density capability within the
next seven years. The three digital LSI technologies that emerge, I²L, CMOS
and DMOS, are discussed in further detail in sections 2.5, 2.6 and 2.7
respectively.

Figure 2.0-1 illustrates the present power-delay products of the
various LSI technologies. Ring oscillator circuits are used as the standard
of comparison for new technologies. Typically, an order of magnitude
degradation in power-delay product is experienced when adding the fan-out,
interconnects and output devices associated with performing large arithmetic
functions with LSI. Anticipated future improvements in the various
technologies are summarized in section 5.0, CONCLUSIONS.
Figure 2.0-1. Mid 1975 LSI technology power delay products

2.1 FUNDAMENTAL LIMITATIONS ON DEVICE PERFORMANCE

In evaluating various device technologies for computer hardware
applications, numerous parameters must be considered. Those devices that
come closest to optimizing these key parameters will emerge as the most
suitable. These considerations are categorized into costs and values:

Costs 

1.  chip  real  estate 

2.  number  of  processing  steps  required 

Values

1. speed

2. speed-power product

3. density


4.  noise  immunity 

5.  device  fan  out 

6.  power  supplies  required 

7.  compatibility  with  existing  technologies 

8.  natural  radiation  tolerance 

9.  life/reliability 

Of primary importance is speed-power product. If the circuitry can be
processed sufficiently small and a low enough speed-power product is obtained,
a good deal of the search is over. Today's technology offers a wide selection
of logic families, ranging from the excellent speed of emitter coupled logic
(ECL) to the very low power of CMOS. The real need, however, is for an
optimum compromise of speed, power and silicon real estate. Various MOS
technologies (n-MOS, silicon gate n-MOS, CMOS) have been addressing this
problem for some time. Recently, bipolar logic has become a true
competitor to MOS, with the advent of Integrated Injection Logic (I²L).

It is only a matter of time before sufficient performance will be
available to enable an optimum choice. It is possible, even probable, that many of
the decisions regarding applications of the various device technologies will
soon become obvious. For instance, since the improved bipolar logic (I²L)
must constantly draw current, even in the off condition, it might be more
applicable to a continuous data system. In contrast, CMOS logic consumes
almost no power in the off condition, rendering it suitable for systems where
data is being handled only intermittently.

To  push  speed-power  product  to  its  absolute  limits,  the  factors  that 
actually  limit  logic  performance  must  be  considered  in  detail. 

Swanson discusses a hierarchy of limitations in regard to logic
devices. First are the absolute physical limits based on two fundamental
laws of physics: thermodynamics and quantum mechanics. Three different
approaches involving basic thermodynamic properties arrive at the same
conclusion: that the energy consumed by a logic gate is greater than $4kT_a$
($1.7 \times 10^{-20}$ J). Since power (P) = $E/\tau_d$, where $\tau_d$ is the propagation delay
through the gate, the thermodynamic limit related in terms of power is:

$$P > \frac{4kT_a}{\tau_d}$$

where

k = Boltzmann's constant

$T_a$ = ambient temperature

During a logic state transition it can be assumed that an energy
barrier is traversed in a device, such that an energy $E_t$ is dissipated in
the transition. Swanson states that the minimum time for this to occur is
about $h/E_t$, where h is Planck's constant. Thus the quantum mechanical limit
requires:

$$\tau_d > \frac{h}{E_t}$$

or

$$E_t > \frac{h}{\tau_d}$$

and in terms of power, since

$$P = \frac{E_t}{\tau_d}$$

this yields

$$P > \frac{h}{\tau_d^2}$$

These  power  considerations  hold  only  for  maximum  switching  rates 
and  comparisons  between  technologies  should  be  made  with  this  in  mind.  As 
mentioned  previously,  CMOS  integrated  circuits  consume  virtually  no  static 
power.  This  is  also  the  case  in  magnetic  core  memories.  Thus  duty  cycle 
or  standby  power  factor  is  another  important  design  consideration. 
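The two fundamental limits can be evaluated numerically for a given propagation delay. The sketch below is illustrative only: the function names, the 300 K ambient temperature and the 1 ns example delay are assumptions, and the constants are the present-day SI values, not figures from the report.

```python
BOLTZMANN = 1.380649e-23  # J/K
PLANCK = 6.62607015e-34   # J*s


def thermodynamic_energy_limit(temperature_k=300.0):
    """Minimum switching energy 4kT, in joules."""
    return 4.0 * BOLTZMANN * temperature_k


def thermodynamic_power_limit(tau_d, temperature_k=300.0):
    """Minimum power 4kT / tau_d at the maximum switching rate, in watts."""
    return thermodynamic_energy_limit(temperature_k) / tau_d


def quantum_power_limit(tau_d):
    """Minimum power h / tau_d**2, in watts."""
    return PLANCK / tau_d ** 2


energy_300k = thermodynamic_energy_limit()       # about 1.7e-20 J
thermo_1ns = thermodynamic_power_limit(1.0e-9)   # assumed 1 ns gate delay
quantum_1ns = quantum_power_limit(1.0e-9)
```

At a 1 ns delay the thermodynamic bound is the larger of the two; because the quantum bound grows as the inverse square of the delay, it becomes the tighter constraint only at much shorter delays.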
There are also limitations due to the properties of the material being
used.

Using a potential hill to simulate the channel of a MOST or the base
of a bipolar transistor, it is shown that the electric field that can be supported
by a particular device limits the speed of the device. How well the medium
can carry heat away from the area of importance is another limiting factor,
termed the thermal conductivity limit by Swanson. Finally, the length of
time required to propagate a signal to interconnected devices also limits
speed. All of the above factors impose approximately the same speed
limitation on logic circuits: the time it takes to perform a single computation
must be greater than $3 \times 10^{-14}$ seconds.

The fundamental limitations are plotted in Figure 2.1-1.

Of course, actual logic device performance is orders of magnitude
away from these theoretical limits. Processing techniques and the general
technological state-of-the-art in the various device technologies have become
the prime performance limiting factors.

The technologies which are promising for high speed, low power
operation include I²L, CMOS, DMOS, CCD logic and a few others. Each of
these is discussed in detail, following the commercial microprocessor
and memory technology surveys.

2.2  COMMERCIAL  MICROPROCESSOR  DEVELOPMENT 

A microcomputer, shown in Figure 2.2-1, is a general purpose
computer having three basic elements: Memory, Control and a
Microprocessor. The memory is used for program storage as well as
scratchpad memory. The control electronics interface with peripheral
units and act as monitor and router of data within the microcomputer. The
microprocessor contains the central processing unit (CPU) in which all the
arithmetic and logical operations are performed.

By use of the LSI technology, it has become possible to place the
microprocessor function on a single LSI chip. As semiconductor technology
produces still more dense, lower power devices, more and more functions
will be incorporated in the basic microprocessor chip.

Figure 2.2-1. A basic microcomputer.


The first microprocessor, introduced in 1971, heralded the beginning
of a new field in electronics that has experienced dramatic growth.

That first 4-bit PMOS device has since been joined by more sophisticated and
faster microprocessors implemented with many other technologies. By
mid-1974, the number of 8-bit microprocessors grew to over twenty, and by the
end of 1976 this number may triple. Figure 2.2-2 is a diagram of
microprocessor chronology.

The fact that microprocessors have been introduced so recently and
the fact that the field has expanded so rapidly complicate the study of them
and make many projections of future development somewhat tenuous. This
section shows the current status of the commercial microprocessor field
and illustrates its trends of development. Using these trends, an attempt will
be made to display what can be expected to be developed in the next two to
five years and thence even further into the future.


[Figure 2.2-2 is a timeline covering 1971 through 1975 with rows for PMOS,
NMOS, CMOS and bipolar devices. Legible device labels include the 4004,
8008, 4040, PPS-8, 8080, 2650, 6800, F-8, TLCS-12, COSMAC, 6100, 6701,
3001, RP-16 and 2901. Manufacturer symbols: AM = AMI, B = Burroughs,
E = Electronic Arrays, F = Fairchild, I = Intel, M = Mostek, N = National,
R = Rockwell, S = Signetics, T = Toshiba, TI = Texas Instruments.]

Figure 2.2-2. Microprocessor chronology.


2.2.1  Microprocessor  Background 

Although microprocessors have been available for about five years,
their use has not been sufficiently widespread to insure universal
understanding of them, the programs that control them, or the systems of which they are
a part. Thus it may be useful to describe briefly the general purpose
microprocessor as well as the resulting programming language.

A microprocessor is a compact digital processor implemented in LSI
technology on one or a small number of semiconductor chips. The
microprocessor corresponds to the Central Processing Unit (CPU) of a large
computer. The microprocessor typically contains an Arithmetic Logic Unit
(ALU) to perform arithmetic and logical operations, one or more
accumulators, and registers for temporary storage of data items such as the
program counter, instructions, and memory addresses.

Microprocessors presently available are characterized by:

1. PMOS, NMOS, CMOS, TTL Schottky, and I²L semiconductor
device technology.

2. A data word length of 4, 8, 12 or 16 bits.

3. Parallel organization.

4. Macroinstruction cycle time from 0.2 µs to 60 µs.

5. Fixed or microprogrammed instruction sets.

6. Memory address capability up to 64K words.

7. Instruction sets having 25 to 100 instructions.

8. Simple input/output structures.

9. Integrated circuits packaged in 16 to 42 pin dual-in-line
packages.

10. Low power consumption (<10 watts total).

From  an  applications  viewpoint,  the  microprocessor  can  be 
regarded  as  an  alternative  to  random  logic  or  custom  LSI  components. 

Using  a versatile  standard  microprocessor,  a complicated  system  can  now 
be  implemented  in  a matter  of  months  instead  of  the  years  that  might  be 
required  to  design  and  fabricate  a custom  LSI  device.  Instead  of  random 
logic,  a program  in  memory  is  used  to  control  a microprocessor  to 
accomplish  the  task  at  hand.  Thus,  the  microprocessor  accomplishes  jobs 

previously  done  sequentially  by  hardware. 

There are two types of microcomputer programming to be
understood. The first is an Assembly language type. Microprocessors
programmed in this language accomplish discrete tasks as required by
the program mnemonic. For example, the instruction Add to Memory
would:

1.  fetch  a word  from  memory  as  addressed  by  the  instruction 
word  or  a register, 

2.  place  the  word  on  an  ALU  input, 

3.  place  the  output  of  an  accumulator  at  the  other  ALU  input, 

4.  place  the  ALU  in  an  add  mode 

5.  place  the  result  of  the  add  operation  into  the  accumulator. 

To the programmer, it would appear that these operations took place
simultaneously. However, if a check were made of the time required for this
instruction, it would be seen that the time is longer than for an Add
Immediate instruction. The difference is memory access time. With assembly
language type programming, only the required functions are specified and
the instruction execution time will always be a function of its complexity.
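The five steps above, and the timing difference between the memory and immediate forms, can be sketched with a toy accumulator machine. Everything here is hypothetical: the class, the 8-bit word size and the cycle costs are assumptions chosen only to show that the memory form pays for an extra access.

```python
class ToyAccumulatorMachine:
    """Hypothetical 8-bit accumulator machine; cycle costs are assumed."""

    MEMORY_ACCESS_CYCLES = 2  # assumed cost of one memory fetch
    ALU_CYCLES = 1            # assumed cost of one ALU operation

    def __init__(self, memory):
        self.memory = list(memory)
        self.accumulator = 0
        self.cycles = 0

    def _alu_add(self, operand):
        # Steps 2-5: operand on one ALU input, accumulator on the other,
        # ALU in add mode, result written back to the accumulator.
        self.accumulator = (self.accumulator + operand) & 0xFF
        self.cycles += self.ALU_CYCLES

    def add_memory(self, address):
        # Step 1: fetch the word addressed by the instruction.
        operand = self.memory[address]
        self.cycles += self.MEMORY_ACCESS_CYCLES
        self._alu_add(operand)

    def add_immediate(self, value):
        # Operand is part of the instruction word; no data fetch needed.
        self._alu_add(value & 0xFF)


machine = ToyAccumulatorMachine(memory=[5, 7])
machine.add_immediate(3)
immediate_cycles = machine.cycles
machine.add_memory(1)
memory_cycles = machine.cycles - immediate_cycles
```

To the caller both instructions look like a single operation, but the memory form consumes more cycles.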
Recently, microprocessors have been fabricated which have either
all or part of their instruction set microprogrammable. This feature allows
the user to define his own instructions by writing a program describing the
discrete steps that the microprocessor must follow. The user defined word
creates the proper multiplexer, ALU, input/output, and memory
configurations to accomplish the desired task. This level of programming is called
microprogramming and usually the microprogram itself is called firmware.
A characteristic of firmware is that each microstep is accomplished in the
same amount of time (a microcycle). The advantage of microprogramming
is that the microprocessor can be tailored to accomplish a set of desired
tasks with maximum speed and efficiency. The disadvantage of this method
of programming is that the firmware is totally hardware dependent and the
programmer must be familiar with the hardware at a data path level.

2.2.2  Computing  Power  of  Microprocessors 

The first microprocessors introduced in 1971 were 4-bit PMOS
machines. By 1975 numerous 8 and 16-bit machines became available in
technologies considered superior to PMOS. Recently several companies
have developed microprocessor elements in 2 or 4-bit slices, that can
be connected together to produce any reasonable word length that is a
multiple of 2 or 4.

The increase in the microprocessor word length is worthy of interest
because  the  alternative  of  processing  double  precision  data  is  both  / 

and  time  consuming. 

Applications requiring multiplication or division also provide an
interesting challenge for the designer using microprocessors. In a
computer containing only ALUs and accumulators, multiplication or division
can be accomplished by doing a series of add and/or subtract operations.

The  most  straightforward  multiply  algorithm  operates  by  shifting  the 
multiplicand  one  bit  at  a time  and  either  adding  or  not  adding  the  result  to 
an  accumulator  depending  on  the  status  of  a corresponding  bit  from  the 
shifted  multiplier.  Numerous  algorithms  have  been  developed  to  reduce  the 
number  of  necessary  shifts  so  that  an  N-bit  machine  does  not  necessarily 
require  N add  operations.  However,  the  multiply  is  still  accomplished 
through repetitive adds. The inconvenience of this type of repetitive
operation can be circumvented by the addition of special hardware dedicated to
multiplication  or  division.  Usually  with  the  addition  of  this  special  hardware, 
the  multiply  or  divide  can  be  accomplished  in  the  same  time  as  an  add  or 
subtract  operation.  Of  course  this  convenience  and  speed  is  gained  at  the 
cost  of  the  additional  hardware  necessary  to  implement  the  multiplication. 
Although presently the multiply/divide feature can be had only by the addition
of hardware, the 16-bit microprocessor being developed by Texas Instruments
(reference 2.2-5) has multiply and divide instructions as a part of its
instruction set. In summary:

1.  Double  precision  arithmetic  is  time  consuming  and  a microproc- 
essor should  be  selected  with  a proper  word  length  to  avoid 
double  precision  words  most  of  the  time. 

2.  Multiply  and  divide  operations  require  additional  hardware  or 
additional  time. 

Figure 2.2-3 is a block diagram of the recently announced AM2901.

The circuit is a four-bit slice cascadable to any number of bits.
Therefore, all data paths within the circuit are four bits wide. The two key
elements in the block diagram are the 16-word by 4-bit 2-port RAM and the
high-speed ALU.

Data in any of the 16 words of the Random Access Memory (RAM) can
be read from the A-port of the RAM as controlled by the 4-bit A address
field input. Likewise, data in any of the 16 words of the RAM as defined by
the B address field input can be simultaneously read from the B-port of the
RAM.

The high-speed Arithmetic Logic Unit (ALU) can perform three
binary arithmetic and five logic operations on the two 4-bit input words R
and S. The R input field is driven from a 2-input multiplexer, while the S
input field is driven from a 3-input multiplexer. Both multiplexers also have
an inhibit capability; that is, no data is passed. This is equivalent to a "zero"
source operand.

The  ALU  R-input  multiplexer  has  the  RAM  A-port  and  the  direct  data 
inputs  (D)  connected  as  inputs.  Likewise,  the  ALU  S-input  multiplexer  has 
the  RAM  A-port,  the  RAM  B-port  and  the  Q register  connected  as  inputs. 
Figure 2.2-3. Block diagram of AM2901
Microprocessor.

The ALU itself is a high-speed arithmetic/logic operator capable of
performing three binary arithmetic and five logic functions. The 3
microinstruction inputs are used to select one of the eight ALU functions.

The  ALU  data  output  is  routed  to  several  destinations.  It  can  be  a 
data  output  of  the  device  and  it  can  also  be  stored  in  the  RAM  or  the  Q register. 
Eight  possible  combinations  of  ALU  destination  functions  are  available. 

Also  included  is  a microinstruction  decode,  which  on  a clock  by  clock 
basis,  determines  the  operations  of  the  ALU  and  the  selected  data  paths 
within  the  microprocessor. 
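To illustrate what "cascadable to any number of bits" means in practice, the following sketch chains four 4-bit slices into a 16-bit adder. It is an assumption-laden model, not a description of the AM2901 internals: the carry is simply rippled from slice to slice, and the names `slice_add` and `cascaded_add` are hypothetical.

```python
def slice_add(r, s, carry_in):
    """Model one 4-bit slice: add 4-bit words R and S plus a carry-in.

    Returns (4-bit result, carry-out).
    """
    total = (r & 0xF) + (s & 0xF) + carry_in
    return total & 0xF, total >> 4


def cascaded_add(a, b, slices=4):
    """Add two words by chaining `slices` 4-bit slices, least significant first.

    The carry is rippled between slices here purely for simplicity.
    """
    result, carry = 0, 0
    for i in range(slices):
        r = (a >> (4 * i)) & 0xF   # this slice's 4 bits of operand A
        s = (b >> (4 * i)) & 0xF   # this slice's 4 bits of operand B
        out, carry = slice_add(r, s, carry)
        result |= out << (4 * i)
    return result, carry


if __name__ == "__main__":
    print(cascaded_add(0x1234, 0x0FCD))
```

Changing `slices` to 2, 3, or 6 gives 8-, 12-, or 24-bit words from the same building block, which is the point of the slice approach.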




2.2.3 Present Technology

To enable comparison between microprocessors, standard evaluation
criteria were developed. For the purpose of this document, the following
characteristics are considered of prime importance for the APSP
application:

1. Add time

2. Multiply and divide capability

3.  Power  dissipation 

Additional  considerations  are: 

1.  Possibility  of  nuclear  hardening 

2. Cost

Before  these  items  are  evaluated  however,  some  pitfalls  in  comparing 
available  data  should  be  discussed.  For  less  complicated  ICs,  specification 
sheets  contain  data  that  is  fairly  well  standardized  from  one  vendor  to 
another.  For  microprocessors  however,  this  is  not  the  case;  different 
vendors  use  different  parameters  to  measure  microprocessor  capabilities. 
This lack of standardization makes the selection of a best microprocessor, on
the basis of application, difficult.

For example, microprocessor computing speed may be given as the
basic cycle time or clock rate. Although this is a valid parameter when
evaluating microinstruction (firmware) cycle time, it is misleading if applied
to the simple-to-use assembly type instructions available in most
microprocessors. Most complicated instructions, like the Add to Memory instruction
mentioned earlier, will require many of the basic cycles or clock periods
for execution. For example, the Intel 8080 "instruction cycle" time is
given as two microseconds (Reference 2.2-5). However, an 8080 "minor cycle
time" is specified as greater than or equal to 500 nanoseconds. This implies
that the number of minor cycles for most instructions would be four.
Examination of the instruction set shows that some instructions require fewer than
four minor cycle times while others require more. It can be assumed that
the two microsecond cycle time is either:

1.  an  average  execution  time  for  a particular  test  program,  or 

2. an average execution time for the entire instruction set.
Thus it is necessary to evaluate a microprocessor with a view toward its
ultimate application. One technique is to construct a "bench mark" program
typical of the device's application, and use this program to evaluate
microprocessors of interest (reference 2.2-4).
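The relationship between a quoted "instruction cycle" and the underlying minor cycle can be made concrete with a short calculation. The sketch below uses the 500 nanosecond minor cycle quoted above; the instruction mixes are hypothetical and chosen only to show how the average depends on the program being run.

```python
MINOR_CYCLE_NS = 500  # minimum minor cycle time quoted in the text


def instruction_time_ns(minor_cycles, cycle_ns=MINOR_CYCLE_NS):
    """Execution time of one instruction needing `minor_cycles` cycles."""
    return minor_cycles * cycle_ns


def average_time_ns(mix, cycle_ns=MINOR_CYCLE_NS):
    """Weighted average execution time.

    `mix` maps a minor-cycle count to the relative frequency with which
    instructions of that length occur in the program of interest.
    """
    total_weight = sum(mix.values())
    weighted = sum(instruction_time_ns(c, cycle_ns) * w for c, w in mix.items())
    return weighted / total_weight


# Hypothetical instruction mixes (assumptions, not measured data).
balanced_mix = {3: 0.25, 4: 0.50, 5: 0.25}
memory_heavy_mix = {4: 0.25, 5: 0.50, 6: 0.25}

if __name__ == "__main__":
    print(average_time_ns(balanced_mix))
    print(average_time_ns(memory_heavy_mix))
```

A mix centered on four minor cycles averages out to the two microsecond figure, while a mix weighted toward longer instructions does not, which is why a benchmark typical of the application is more informative than a single quoted cycle time.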

With this caution in mind, it is now possible to examine parameters of
particular devices. Figure 2.2-4 was reproduced from Reference 2.2-2,
dated 15 April 1975, and compares 24 characteristics of 22 different
microprocessors. To condense the data given by this figure, Figure 2.2-5
summarizes, by technology type, the Memory to Register add times for the
various microprocessors listed. As indicated, all add times are in the
microsecond range except for the bipolar Intel 3000 series at 300 nanoseconds
(register to register). Figure 2.2-6 depicts the power dissipation of devices
of various technologies based upon mid-1975 commercial production. As the
figure shows, each technology occupies a fairly specific power/delay
characteristic.

Multiplication and division must be implemented with adds, subtracts
and shifts, and can be made faster by additional hardware. The soon to be
released TI TMS9900 (not shown on the figures) can do 16-bit multiplication
in approximately 17 microseconds, without additional hardware
(reference 2.2-5).

The add times given are for the word lengths of the particular machine.
Again the reader is cautioned about the difficulties of handling double
precision words. Thus, although a particular microprocessor may be able to
do an 8-bit add in 2 microseconds, in no way should this imply that a 16-bit
add would require 4 microseconds. As a rule the time to add the double
precision word is more than double that of a single precision word. The
added number of instructions might be as many as 10 or 15, to handle the
conditions that might occur, such as overflow and carry.
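A short sketch shows why a double precision add costs more than two single precision adds. The step list below is illustrative only and does not correspond to any particular instruction set; the function names are hypothetical. Even this minimal version needs seven steps around the two 8-bit adds.

```python
def add8(a, b, carry_in=0):
    """Add two 8-bit words plus a carry; return (8-bit sum, carry out)."""
    total = (a & 0xFF) + (b & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def add16_on_8bit_machine(a, b):
    """Add two 16-bit words using only 8-bit adds.

    Returns (16-bit sum, carry out, list of illustrative steps).
    """
    steps = []
    steps.append("load low byte of a")
    steps.append("add low byte of b")
    low, carry = add8(a & 0xFF, b & 0xFF)
    steps.append("store low byte of result")
    steps.append("load high byte of a")
    steps.append("add high byte of b with carry")
    high, carry_out = add8((a >> 8) & 0xFF, (b >> 8) & 0xFF, carry)
    steps.append("store high byte of result")
    steps.append("test carry out / overflow")
    return (high << 8) | low, carry_out, steps


if __name__ == "__main__":
    total, carry_out, steps = add16_on_8bit_machine(0x12FF, 0x0001)
    print(hex(total), carry_out, len(steps))
```

Handling signed overflow, or operands that are not already in registers, would add further steps to this list.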

In  summary: 

1. An N-bit microprocessor can do an N-bit add to memory in
about 2 microseconds with current commercial technology.

2. Double precision arithmetic or multiplication can be executed
on any microprocessor but at the expense of some hardware
or a lot of time.
Figure 2.2-4. Microprocessors.

Figure 2.2-6. Speed power product.

3. PMOS is currently being displaced by NMOS and CMOS due
to the latter devices' reduced power requirement, increased
speed and TTL compatibility.

4. 16-bit microprocessors are currently available that operate
at about the same speed as 8-bit microprocessors. This
implies that it would be unnecessary for an 8-bit device to
be applied in a system requiring 16-bit operands.

2.2.4 Development Trends and Projections

Trends of execution speed, chip density, and power can be analyzed
by looking at them in the past and projecting into the future. It should be
noted that since microprocessors have been around less than five years, any
observable trends have developed only fairly recently.

Figure  2.2-7  lists  various  technologies  from  1965  to  1974,  and  pro- 
jects the  technologies  of  1980.  Each  technology  is  ranked  in  order  of  its 
anticipated  predominance,  so  that  while  PMOS  is  the  prime  commercial 
technology  in  1974,  it  will  nearly  disappear  by  1980.  By  1980  the  strong 
commercial  technologies  should  be: 

1. I2L

2.  CMOS 

3.  DMOS 

These three technologies will be used in both military and commercial
applications. An evaluation of each technology based on the parameters
shown in Figure 2.2-9 yields the ranking in Figure 2.2-8. Using power and
speed criteria only, it would appear that CMOS/SOS would best serve military
needs (see Figure 2.2-8). Note that CMOS/SOS is expected to be the most
available technology in 1980. Note that, for 1980, the technology ranked
highest on Figure 2.2-8 (CMOS/SOS) is at the bottom of the list of
technologies shown in Figure 2.2-7. The reason is that Figure 2.2-7 favors what
the commercial world will develop due to parameters such as yield and
density, while Figure 2.2-8 is based upon the military selection parameters
shown in Figure 2.2-9. The most recent advances in DMOS technology, not
included in this data, typify the fluidity in the semiconductor field and
TECHNOLOGIES (figure content; column headings not recoverable)

Chip Size (max), mils
Device Density, sq mil/gate:   20;  1
Speed Power Product:           100;  5-10
Clock Rates (MHz):             20;  300
Weight Per Gate (lbs):         5x10^-4;  1x10^[illegible]
Unassigned values:             0.7-0.4;  0.01-1;  2000;  1x10^-7


Figure  2.2-7.  Large  scale  integration  technology,  listed  in  order  of 
projected  share  of  the  commercial  market  as  a function  of  time. 

Figure 2.2-8. Comparison of semiconductor technologies (excluding
DMOS) for various design parameters.


Nuclear-Effects Survivability

      Systems Level EMP
      Signal Conducted EMP
      Nuclear Transient (Neutron, X and Gamma Rays)
      Neutron Dose
  /   Ionization Total Dose (Electrons, X and Gamma Rays)

Electromagnetic Vulnerability

      Communications Security (COMSEC)
      TEMPEST
      Jamming (ECM, ECCM)
      Radiation Intelligence (RINT)

Viability

      Reliability
  /   Availability
      Maintainability
  /   Electromagnetic Compatibility (EMC)

Physical Characteristics

 //   Weight
///   Power Consumption
  /   Cooling Requirements
  /   Speeds

Impact Upon APSP Application

  /   Minor
 //   Moderate
///   Large

Figure 2.2-9. Characteristics for military uses affecting
selection of LSI technologies.


indicate  the  risk  associated  with  predictions.  DMOS  may  very  well  be  the 
dominant  LSI  technology  by  1978  or  79  (see  sections  2.4,  2.7),  if  its  con- 
siderable promise  is  realized. 

Figure 2.2-10 gives an idea of the trend in circuit densities for
integrated circuits. According to this figure, memory bit density in 1978 is
expected to be about four times the 1974 value. For CPUs, the density is
expected to double in the same period. Basically this implies that the
physical area and thus the number of chips required to implement future systems
will decrease. This also implies that the amount of hardware that can be
cost effectively replaced by microprocessors will increase in the future.
Figure 2.2-11 indicates that commercial microprocessor cycle times will
drop to about 25-60 nsec by 1982.
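The density projections above can be restated as equivalent yearly growth factors. The short calculation below assumes steady compound growth between 1974 and 1978, which is a simplification and not something the figure itself asserts.

```python
def annual_growth_factor(total_factor, years):
    """Yearly multiplier that compounds to `total_factor` over `years`."""
    return total_factor ** (1.0 / years)


# Projections quoted in the text: 4x memory density and 2x CPU density
# between 1974 and 1978 (four years).
memory_growth = annual_growth_factor(4.0, 4)
cpu_growth = annual_growth_factor(2.0, 4)

if __name__ == "__main__":
    print(round(memory_growth, 3), round(cpu_growth, 3))
```

Under that assumption memory density grows by roughly 41 percent per year and CPU density by roughly 19 percent per year.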

As mentioned earlier, evaluation of a microprocessor from its
specification sheet may be imprecise due to lack of standardization; benchmark
programs should be written in order to test different devices for a specific
task. A set of benchmark programs (Figure 2.2-14) was written for a
typical application and the results of running these programs on several
microprocessors are shown in Figures 2.2-12 and 2.2-13. Although the
number of instructions and amount of time varies widely for the different
devices, it should be noted that other important selection parameters such
as power or physical size of the devices are not treated. The purpose here
is not to demonstrate one device's superiority over another, but rather to:

1.  Display  the  data  array  that  is  obtained  when  a benchmark 
program  is  run  on  various  microprocessors. 

2.  Demonstrate  that  since  microprocessors  were  first  intro- 
duced tremendous  changes  have  occurred  in  their  charac- 
teristics, and  that  the  next  few  years  are  expected  to  yield 
equally  significant  changes. 


Figure 2.2-10. Integrated circuit density and price trends.

Figure 2.2-13. Program memory versus speed (in µsec) for various
microprocessors.


PROGRAM LISTING

A. Movement of Blocks of Data

    SET UP    MOV     #Base 1, R1
              MOV     #Base 2, R2
              MOV     #Char, R3
    LOOP 1    MOVB    (R1)+, (R2)+
              SOB     R3, LOOP 1
              EXIT

B. Servicing Interrupt

              MOV     #INT LOC, R1
              MOV**   PC, (R1)+
              MOV     ACC, (R1)+
              MOV     FLAGS, (R1)+
              MOV     -(R1), FLAGS
              MOV     -(R1), ACC
              MOV     (R1), PC

C. Addition of "N" Decimal Digits and Store

    SET UP    MOV     #Base 1, R1
              MOV     #Base 2, R2
              MOV     #1010, R3
              MOV     #N, R4
    LOOP 1    MOV     (R1)+, R4
              *ADD    (R2), R4
              MOV     R4, (R2)+
              SOB     R3, LOOP 1
              EXIT

D. Search for a Character String

              MOV     #Mask, R1
              MOV     #Char, R2
              MOV     #0, R3
    LOOP 1    CMPB    #255, R3
              BEQ     EXIT
              MOV     (R3)+, R4
              CMPB    R1, R4
              BEQ     LOOP 2
              MOV     #Char, R2
              JMP     LOOP 1
    LOOP 2    SOB     R2, EXIT
              JMP     LOOP 1
    EXIT

E. Monitor 8 Data Channels

              MOV     INT, R1
              MOV     (R1), R2
              INCB    R2
              MOV     R2, (R1)
              EXIT

A. Movement of Blocks of Data

                   Program    Set Up     Move Time/
                   Bytes      Time       Character
    Micro-Level      34       3.3 µs     3.0 µs
    Macro-Level      10       7.8 µs     6.0 µs

B. Servicing Interrupt

                   Program    Service
                   Bytes      Time
    Micro-Level      42       9.0 µs
    Macro-Level      14       19.5 µs

C. Addition of "N" Decimal Digits and Store

                   Program    Set Up     Add Time/
                   Bytes      Time       Byte
    Micro-Level      46       4.2 µs     5.1 µs
    Macro-Level      16       10.2 µs    11.1 µs

D. Search for a Character String

                   Program    Set Up     Search Time/
                   Bytes      Time       Character
    Micro-Level      42       4.2 µs     4.5 µs
    Macro-Level      24       8.7 µs     15.0 µs

E. Monitor 8 Data Channels

                   Program    Through Put/
                   Bytes      Character
    Micro-Level      20       3.3 µs
    Macro-Level       8       9.3 µs

*Special macroinstruction composed of AL, ABF, CAD after receiving (R2)
**In response to activity indicated over interrupt line


Figure 2.2-14. Benchmark programs.
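The micro-level versus macro-level entries in Figure 2.2-14 can be summarized as two ratios per program: how much more program memory the microprogrammed version needs, and how much faster it runs. The sketch below is a reading aid only. The values are transcribed from the figure as best they can be read, the per-item time column is used where two time columns exist, and the dictionary and function names are hypothetical.

```python
# (micro bytes, macro bytes, micro time, macro time); times share one unit,
# so the ratios do not depend on what that unit is.
BENCHMARKS = {
    "A block move":      (34, 10, 3.0, 6.0),
    "B interrupt":       (42, 14, 9.0, 19.5),
    "C decimal add":     (46, 16, 5.1, 11.1),
    "D string search":   (42, 24, 4.5, 15.0),
    "E channel monitor": (20, 8, 3.3, 9.3),
}


def micro_vs_macro(name):
    """Return (memory ratio, speed ratio) of micro-level over macro-level."""
    micro_bytes, macro_bytes, micro_time, macro_time = BENCHMARKS[name]
    memory_ratio = micro_bytes / macro_bytes   # >1 means micro needs more memory
    speed_ratio = macro_time / micro_time      # >1 means micro runs faster
    return memory_ratio, speed_ratio


if __name__ == "__main__":
    for name in BENCHMARKS:
        memory_ratio, speed_ratio = micro_vs_macro(name)
        print(f"{name}: {memory_ratio:.2f}x memory, {speed_ratio:.2f}x speed")
```

In every row the micro-level program is larger and faster, which matches the earlier remark that microprogramming trades programming effort and storage for speed.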
2.2.5 Examples of Microcomputer Usage


The  following  are  two  recent  examples  of  microprocessor  usage  in 
systems  design. 

The first is the HMC-1820 used as a controller of computer peripherals.
Figure 2.2-15 shows the total block diagram of the controller and Table 2.2-1
summarizes the design highlights.

The  second  example  is  the  Hughes  Militarized  Microcomputer  (MMC) 
developed  as  the  Central  Processor  Unit  (CPU)  for  such  applications  as  radar 
signal  processors,  and  missile  guidance  computers. 

Figure 2.2-16 is the block diagram of the MMC, and Table 2.2-2
contains the major design characteristics.
TABLE 2.2-1. HMC-1820 DESIGN FEATURES

Architecture/Performance

  Basic Architecture:        General register, microprogrammed

  Microinstruction Length:   Twenty bits

  Data Word Length:          Sixteen bits (HMC-1620)/Eighteen bits
                             (HMC-1820)

  Microinstructions:         Twenty-eight Arithmetic/Logic
                             Twenty-eight Immediate Arithmetic/Logic
                             Four Flag Control
                             Four Shift/Rotate (3 in HMC-1820)
                             Five Conditional Branch
                             Three Unconditional Branch (including
                             indirect and address modified branches)
                             Ten Input/Output (optional)

  ROM Size:                  512 words minimum
                             Expandable to 4096 words in 512-word
                             increments

  Registers:                 Eight general purpose
                             One Shift/Rotate

  Microinstruction           333 nanoseconds (250 nsec optional)
  Execution Time:            667 nanoseconds (500 nsec optional)
                             for immediate operand or if branch taken

  Interrupts:                Eight priority interrupts, vectored
                             Microprogram controlled interrupt enable
                             One level of interrupt return address
                             storage

Support

  Software:                  Cross assembler for IBM 360/70

  Firmware (option):         Microcontroller Test Program

  Hardware (option):         Operator console with breakpoint, snapshot
                             and single clock controls to aid in
                             microprogram debugging
                             ROM Simulator for microprogram checkout

  Number System                   Binary, Fixed Point

  Data Representation             Fractional 2's Complement

  Operation                       Parallel arithmetic, 16 bits/word

  Instruction Set                 50 instructions

  Double Precision Instructions   Add, Shift Left, Shift Right

  Registers                       8 General Purpose Registers

  Interrupts                      4 Vectored Priority Interrupts and
                                  Power Fail Interrupt

  Operation Times                 Load/Add/Sub            2.2 µsec
                                  Store                   2.6 µsec
                                  Multiply                9.0 µsec
                                  Divide                  13 µsec
                                  Jump (Unconditional)    2.0 µsec

  Processing Speed                400 KOPS (thousand operations per
                                  second)

  Addressing Modes                Direct, Indexed Relative, Relative
                                  Indirect, Indirect, Immediate, Register

  Control Processing              Microprogrammed Control
                                  Program Stored in 512 x 32 ROM



2.3 MEMORY TECHNOLOGY

In this section the various key parameters which dictate the appropriate
types of memories are discussed and those memory technologies which best
fit are enumerated. The particular requirements which are discussed are
non-volatile program store, scratch pad, microcode storage (ROM) and large
serial memories.

Spaceborne signal processors have memory requirements which
cover a wide spectrum of technologies. If there is to be a general purpose
computer with stored program it will require a random access memory (RAM)
in which the program resides. In general the contents of this memory are
seldom changed, but should be capable of being updated. This requires a
non-volatile (i.e., maintains data without power) writable memory. There
are presently three candidate technologies: magnetic plated wire, MNOS,
and ultra-violet erasable MOS structures. Program storage memory typically
will have a radiation hardened specification since it is imperative that the
computer program (or at least critical portions of it) survive if the general
purpose computer is to be useful. Presently 2 mil plated wire appears to be
best suited to the above requirements as well as to low power and weight.
The two semiconductor technologies have the disadvantage of complex erase
procedures, and a significant amount of overhead for the write circuitry.
The ultra-violet erasable memories presently fatigue after some number of
write-erase cycles (~100 to 1000).

In contrast to the program store, which should be capable of update,
the microcontroller memory, which stores the firmware that drives the
microprocessor, need not be updated. This memory is a read-only memory (ROM),
usually less than one thousand words, and of comparatively high speed. The
speed of this memory will usually determine the minor cycle time of the
processor, thus the memory cycle time must not exceed the register to register
data path delays internal to the processor.

Another general class of memories are read/write random access
memories used as scratch pad for temporary storage in typical arithmetic
computations. These memories should have read access times of about
1 to 3 times the minor cycle time of the processor and in general are not
required to be non-volatile. For this class of memories, it is expected that
the memory technology will be the same as the logic technology (I2L, DMOS,
CMOS/SOS). Table 2.3-1 is a summary chart of basic characteristics of
random access memories.

The third class of memories are very large data banks, particularly
in the special purpose signal processors where there is a requirement for a
significant number (16 to 80) of bits per pixel. By the nature of the focal
plane readout method (pixel serial on an MFPA) the natural memory would be
serial.

The memory could be volatile, which implies that during power outage
data is lost. The potential size of this memory places highest priority on
low power and maximum bit density to reduce weight. The prime candidate
technology appears to be CCD memories. (Note that two other memory
technologies are not considered viable contenders for the APSP application:
(1) magnetic bubble memories, primarily for reasons of speed and bulk, and
(2) optical memories for reasons of power and mechanical reliability.
Present bubble memories are limited to approximately 100 kHz and have a
fundamental materials limit at 1 MHz. The APSP application requires
1.64 MBit capability (Ref. CDRL A006 APSP Architecture Study). The
relatively high power requirements of optical memories coupled with
reliability concerns caused by their moving elements effectively preclude their
use for satellite applications, and restrict their use even in ground-based
systems.)

2.3.1 CCD Memory Technology

The primary advantages of CCD memory compared to conventional
digital shift registers or other digital memory devices are low power
dissipation, small element size, and potential low cost per bit. For ground
based systems, the power saving and small size are probably of secondary
importance, whereas, for spaceborne equipment, these would be significant
factors in large bulk memories.

The  serial  structure  of  the  CCD  does  not  directly  lend  itself  to  random 
access  storage,  and  the  dark  current,  due  to  thermally  generated  carriers, 
tends  to  degrade  stored  data  as  a function  of  time.  These  devices,  therefore, 

TABLE 2.3-1. BASIC CHARACTERISTICS OF MAGNETIC CORE, PLATED WIRE, AND
SEMICONDUCTOR MEMORIES, 1975 TECHNOLOGY


are  dynamic  memory,  and  the  data  must  be  regenerated  not  only  as  a function 
of  time  when  the  clock  frequency  is  low,  as  is  usually  the  case  in  a standby 
mode,  but  also  after  a number  of  transfers  due  to  charge  losses.  The  CCD 
digital  memory  organization  providing  the  highest  density  on  the  chip  is  a 
"serpentine"  arrangement  of  long  chains  of  serially  connected  shift  registers. 
This  configuration  requires  a minimum  of  peripheral  circuits  which  use  a 
relatively  large  amount  of  silicon  real  estate,  and  provides  a highly  repetitive 
(and,  therefore,  efficient)  arrangement  for  the  CCD  itself.  To  reduce  access 
time, the CCDs can be organized in a serial-parallel-serial (SPS)
arrangement or in a parallel multiplexed arrangement called line-addressable
random access (LARAM). The SPS pattern is very efficient in power
requirement. It employs an information flow as shown in Figure 2.3-1. The
memory  is  entered  with  a high  speed  serial  shift  register;  when  the  input 
register  is  fully  loaded,  all  of  its  cells  are  transferred  simultaneously 
downward  to  an  output  register  which  is  read  out  serially.  Because  of  the 
parallel  transfer,  the  downward  shift  occurs  only  once  for  each  row  as  a 
whole,  thus  the  row  gates  operate  at  1/n  times  the  frequency  of  the  input 
register,  and  the  power  requirement  for  the  interior  rows  is  relatively 
small  compared  to  that  of  the  input  and  output  registers.  In  addition,  the 
M-fold  increase  in  time  allowed  for  the  parallel -downward  transfer  helps  to 
achieve  a high  charge  transfer  efficiency  and  low  power. 
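The SPS data flow can be modeled in a few lines to show why the interior rows are clocked far less often than the input and output registers. This is a behavioral sketch under simplifying assumptions: the class name `SPSMemory` and its methods are hypothetical, charge loss and refresh are ignored, and the bottom row is handed to the output register in a single step.

```python
from collections import deque


class SPSMemory:
    """Behavioral model of a serial-parallel-serial CCD memory block."""

    def __init__(self, rows, cols):
        self.cols = cols
        self.input_reg = []                      # high-speed serial input register
        self.interior = deque([None] * rows)     # interior rows; None = empty
        self.output = deque()                    # serial output register contents
        self.serial_clocks = 0
        self.parallel_clocks = 0

    def clock_in(self, bit):
        """Shift one bit into the input register (one serial clock)."""
        self.input_reg.append(bit)
        self.serial_clocks += 1
        if len(self.input_reg) == self.cols:
            # Input register full: one parallel transfer moves every row down.
            self.interior.appendleft(self.input_reg)
            bottom = self.interior.pop()
            if bottom is not None:
                self.output.extend(bottom)
            self.input_reg = []
            self.parallel_clocks += 1

    def clock_out(self):
        """Read one bit serially from the output register, or None if empty."""
        return self.output.popleft() if self.output else None


if __name__ == "__main__":
    mem = SPSMemory(rows=2, cols=4)
    for bit in [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]:
        mem.clock_in(bit)
    print([mem.clock_out() for _ in range(4)], mem.serial_clocks, mem.parallel_clocks)
```

With four cells per row the model performs one parallel transfer for every four serial clocks, which is the reduced row-gate frequency the text credits for the low interior power.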


Figure  2.3-1.  SPS  memory  data  flow. 
A manufacturing problem of the SPS design is that, unlike the ordinary
linear register, all the data do not traverse the same path in the CCD. Local
imperfections on the surface of the chip may distort an individual portion of
the data. The line addressable memory structure provides independent
access  to  each  cell  of  the  input  and  the  output  register,  so  that  a defective 
line  can  be  wired  out  of  the  circuit  by  connections  inside  or  outside  the 
integrated  circuit  package. 

The  practicality  and  usefulness  of  CCD  digital  memory  have  become 
apparent. Several companies are now offering as stock items memory chips
with 16 kilobit capacity and line address organization for commercial
temperature range. These chips have relatively slow access time but are small
and  need  little  power. 

An  example  of  more  advanced  technology  is  the  Hughes  SPS  type  2069 
memory  chip,  shown  in  Figure  2.3-2,  which  has  32  kilobit  capacity  and 
requires  only  5 mW  input  power  at  1 MHz  clock  rate.  A new  version  of  this 
memory  is  being  built  with  a smaller  basic  cell  size  and  with  64K  bits.  It 


Figure 2.3-2. Hughes 2^15 (32,768 bit) memory chip 2069.


appears to be quite practical to increase the speed and the chip capacity by
orders of magnitude within the coming years. The most modern projection
photolithography, mask making, and processing techniques will be required.
Table 2.3-2 presents the range of digital memory CCDs now available as
experimental or stock items. A future expectation of a single chip with
1 MHz clock rate and 10^6 bit capacity, organized in very long registers, is
quite realistic.

Most of these chips are intended for a commercial market and may
not necessarily meet the military temperature requirements. The curve
shown in Figure 2.3-3 illustrates the area of application of these digital CCD
memories, showing the per bit cost versus speed of operation. If power
dissipation, mechanical reliability, and other factors are taken into account the
CCD memory may also be competitive with disc memory at the lowest cost
level.

2.4  DIGITAL  LOGIC  FAMILIES 

In this section the key digital LSI technologies are presented. These
separate into four general branches: bipolar, MOS, CCD, and miscellaneous
technologies. The individual categories are described and evaluated in terms

TABLE 2.3-2. CHARACTERISTICS OF SINGLE CHIP CCD MEMORIES

                                                                 Power, mW           Access Time, µs
                                                                       Standby-
  Device                     Technology  Bits       Clock Speed       Operate  Recirculate   Nom    Max

* Fairchild CCD 450 (1975)   n Channel   9 x 1024   50 kHz - 3 MHz    250      30            168    340
* Bell Northern (1974)                   8 x 1024   1 MHz                                    128    256
* Intel 2416 (1975)                      16 x 1024  1.3-2 MHz         200
  Fairchild (1975)           n Channel   16 x 1024  5 MHz             200
+ Westinghouse (1975)                    2048       (Non-volatile     50                     12.8   25.6
                                                    memory)
+ Fairchild (1975-76)                    32K
+ Hughes 2068                n Channel   32K        1.7 MHz - 20 MHz  10       (Low Power Memory Chip)

*Offered for sale as stock items, 1975
+Experimental or proprietary devices

Figure 2.3-3. Application scope of
CCD memory devices.

of potential application to the LSI requirements for the APSP. Only those
technologies that appear as certain contenders for high speed low power LSI
are carried further. I2L, CMOS and DMOS appear to be the most likely
technologies to fulfill this role in the early 1980s.

2.4.1 Bipolar LSI

Until recently the progress of bipolar LSI has practically been at a
standstill while MOS technology continued to advance and thus has dominated
the LSI scene. Bipolar logic has remained the industry workhorse for high
speed, but has not evolved effectively into LSI for two basic reasons: size
and power. Size: the standard T2L LSI gate requires 20 sq mils compared
with P-MOS with 11 sq mils or N-MOS with 5-1/2 sq mils. Power: in order
to maintain gate delays less than 10 ns per gate, power consumption becomes
10 mW per gate. Larger wafer size can accommodate the large per gate area
requirement (at the sacrifice of yield) but the high power consumption for a
moderate sized 300-gate chip at 10 mW per gate would be 3 watts, requiring
external cooling. In order to reduce per gate power, larger resistors are needed
to limit current, which increases the area requirement, thereby aggravating
the yield problem.


The "bottom line" is power-delay product. For SSI T²L the power-delay
product is approximately 100 picojoules. A 300-gate chip with a total
power dissipation of 300 mW would limit per gate power to 1 mW, and gate
delay would grow to 100 ns. At 100 ns delay per gate bipolar logic loses its
speed advantage. Going further to 1000 gate equivalent LSI circuits, the
problem becomes even worse, with very large wafer sizes and even slower
logic. Consequently, the growth of bipolar LSI has been very limited.
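
The tradeoff described above can be checked with a few lines of arithmetic.
The following is a minimal sketch and not part of the original study. It
assumes the power-delay product stays fixed at the 100 pJ quoted for SSI T²L
as per gate power is reduced; the function names are illustrative only.

```python
# Power-delay arithmetic for a logic family with a fixed power-delay product.
# Units are chosen so that pJ = mW * ns.

T2L_POWER_DELAY_PJ = 100.0  # figure quoted in the text for SSI T2L


def gate_delay_ns(power_delay_pj: float, gate_power_mw: float) -> float:
    """Gate delay implied by a fixed power-delay product."""
    return power_delay_pj / gate_power_mw


def chip_power_w(gates: int, gate_power_mw: float) -> float:
    """Total chip dissipation in watts for a given per gate power."""
    return gates * gate_power_mw / 1000.0


if __name__ == "__main__":
    for gate_power_mw in (10.0, 1.0):
        delay = gate_delay_ns(T2L_POWER_DELAY_PJ, gate_power_mw)
        total = chip_power_w(300, gate_power_mw)
        print(f"{gate_power_mw:5.1f} mW/gate: delay {delay:6.1f} ns, "
              f"300-gate chip {total:4.1f} W")
```

At 10 mW per gate the sketch reproduces the 10 ns delay and 3 watt chip
dissipation cited above; at 1 mW per gate it gives the 100 ns delay.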

ECL-10K (Emitter Coupled Logic) is the highest speed logic technology
currently available from multiple sources. With this technology,
production integrated circuits have been built which exhibit gate propagation
delays ranging from 600 ps for .10 mA switched current to 5 ns for a
1 mA current. This performance can be realized in relatively high yield
LSI arrays. Moreover, the technology is sufficiently well established that
even in relatively low volume production, LSI components exhibit minimal
part failure rates. However, as seen on the power-delay curve of Figure 2.1-1,
the power consumed by ECL-10K and related bipolar technologies is high.
It appears that the power-delay product will not approach the requirements
of the space-borne APSP.

Integrated Injection Logic (I²L), or Merged Transistor Logic (MTL),
was introduced as a new form of bipolar logic circuit in two papers
presented at the 1972 International Solid-State Circuits Conference.
I²L represents a major advance in high density, low power-delay product
bipolar logic, and is one of the promising technologies for the LSI APSP
requirements. The evolution, present status and future of I²L are discussed
in detail in section 2.5 of this report.


2.4.2 MOS LSI

Current  production  and  future  high-speed  MOS  LSI  technologies  are 
described  below  in  essentially  chronological  order  of  their  development. 
Diagrams  depicting  current  state-of-the-art  device  cross  sections  are 
included  along  with  each  discussion  in  most  cases. 


P-MOS -- The first LSI technology, P-MOS, was introduced
commercially in 1967. Offering high density and a minimum number of
processing steps, it opened up the electronic calculator market by providing
highly complex, low cost arithmetic circuits. P-MOS memory circuits
soon followed and have since been produced in greater volume. The
production of these ICs (shift registers, read-only memories and
random-access memories) formed a foundation of experience on which current,
more advanced, MOS memory technologies are built.

The cross section of a typical P-MOS transistor is shown in
Figure 2.4-1. A single boron diffusion process step is used to form both
source and drain, and channel length is defined implicitly by the distance
between diffused areas. In normal switching circuit operation a
negative voltage (larger than a specific threshold voltage) applied to the gate
will cause channel inversion (i.e., a p region will form beneath the gate
oxide and provide a conductive path between source and drain). Thus, the
aluminum gate must overlap both source and drain diffusions by a sufficient
margin to ensure that mask alignment errors do not result in the channel
being partially uncovered. If the channel were partially uncovered, part of
it could not be inverted and the device would be rendered useless.

Figure 2.4-1. P-MOS device cross section.

A basic MOS inverter is shown in Figure 2.4-2. In a P-MOS circuit,
the load device is a second P-MOS transistor whose gate is returned to a
VGG power supply (where VGG < VDD < VSS). Signal levels in the circuit
switch between VSS and VDD. When the input equals VSS, the inverter
transistor is cut off, and the output is pulled to VDD through the load device.
If the input changes to VDD, the channel of the inverter transistor inverts,
forming a p-region between source and drain which allows current to flow
from the load device to VSS, thereby pulling the output to VSS.

P-MOS  technology  has  various  shortcomings  that  limit  its  speed  and 
density,  including: 

1. Low gm/high impedance. Tolerances in photolithography and
lateral diffusion require that the channel length be greater than
approximately 0.2 mil. In principle gm can be made as high as
desired and the impedance as low as required simply by lifting
all restrictions on gate width. However, parasitic capacitances
are then increased, as discussed below. Moreover, device sizes and
component densities become unacceptably large in the achievement
of gm greater than a few hundred micromhos and impedances
less than a few kilohms.

2. High threshold voltage. The 6-8 volt thresholds of the original
P-MOS require logic swings and VDD supply levels of 12-15 volts.
Besides requiring special power supplies, this complicates the
interface between P-MOS and bipolar logic.

3. Large parasitic capacitances. The Miller capacitance caused by
the gate-to-drain overlap is one of the main reasons why operating
rates for circuits built with this technology are limited to
approximately 200 kHz.* Other limiting capacitances are the
drain-to-substrate capacitance and the channel capacitance which
must be charged before inversion can occur. The low resistivity
substrate used in P-MOS technologies causes these capacitances
to be comparatively large.

Figure 2.4-2. MOS inverter circuit.

Lessening or bypassing these shortcomings has been the goal of all
subsequent MOS technologies.

Self-Aligned P-MOS -- Major improvements over the original P-MOS
are realized in the self-aligned P-MOS gate structure. A typical cross
section is shown in Figure 2.4-3. As in original P-MOS, source and drain
are again formed by a single diffusion processing step. In the self-aligned
structure, however, these are positioned sufficiently far apart to assure that
the gate metalization does not overlap either diffusion. After the metalization
has been effected, boron ions are implanted to bridge the gaps between the
gate and the source and drain. The resulting device has negligible overlap
capacitance and can be used at clock rates approaching 20 MHz. Moreover,
a smaller device can be made using a self-aligned technique since the
overlap distance need no longer be included as a part of the channel length.

Figure 2.4-3. Self-aligned P-MOS cross section.


* The above data pertain primarily to early P-MOS technologies. With
improved processing technology, P-MOS circuits have been built with
V thresholds. These are capable of running at 1-2 MHz clock rates.

Another improvement achieved using ion implantation was threshold
voltage reduction. This technique was utilized after it was found that the high
thresholds of the early devices were caused by positive ions trapped in the
gate oxide. Subsequently, it was determined that the implantation of a
precisely controlled amount of positive dopant into the channel could lower
the threshold to any desired voltage.

N-MOS -- N-channel MOSFETs were considered promising from the
very beginning of MOS development work because the high carrier mobility
of n-type material (three to four times that of p-type) was known to result
in higher transconductance and lower resistance for a given geometry.
However, the positive ions which are always trapped at the silicon/oxide
interface were found to cause the channels in early devices to invert without
any applied bias. Worse yet, the same trapped charge, Qss, caused inversions
in regions which were not intended to be channels, so that all diffusions were
effectively interconnected.

As noted above, the effect of the trapped charges can be counteracted
by implanting additional positive ions into the silicon substrate, to assure
that the channel remains p-type in the absence of applied gate voltage.
Thus ion implantation development was required for N-MOS processing.
The cross section of a typical ion implanted N-MOS device is shown in
Figure 2.4-4.

Figure 2.4-4. Ion implanted self-aligned NMOS gate.

If the implanted positive ions are omitted from the gates of the load
transistors (cf. Figure 2.4-2), the resulting devices may be used with their
gates and sources connected together, since no VGG bias is required to keep
the loads turned on. Such depletion mode* loads are used in N-MOS to
permit a significant size reduction relative to P-MOS. This reduction stems
both from the smaller size of N-MOS devices and from the simplified
interconnection requirements achieved by the use of depletion loads.
Moreover, depletion mode loads also allow faster circuit operation, since
these load devices do not tend to turn off as output voltage levels approach
VDD. In a recent work (reference 2.4-11) a ring oscillator N-MOS device using
ion implantation with a depletion load achieved 115 ps propagation delay and
0.29 pJ power-delay product, using very small geometries and high substrate
doping (1 µm channel length).

Silicon Gate N-MOS -- A considerable improvement over the
self-aligned aluminum gate N-MOS device was realized in the silicon gate
device, as suggested by the cross section shown in Figure 2.4-5. In silicon
gate MOS devices, deposited polycrystalline silicon is used instead of aluminum
for the gate material. As much as a 50 percent reduction in IC die size
can be realized with silicon gate circuits, since the polysilicon can be used
as an additional interconnection layer for signals as well as allowing the
source and drain connections to overlap the gate. For this reason, silicon
gate technology is used widely in large memory arrays.

Figure 2.4-5. Silicon gate N-MOS cross section.

* Depletion mode devices require an applied gate-to-source voltage
to turn them off (depleting the current flow), whereas enhancement
mode devices require gate-to-source voltage to turn them on.

CMOS -- Complementary MOS devices are attractive for low power
applications because both the inverter (n-channel) and load (p-channel)
transistors are switched by the input signal, with the result that power is
consumed essentially only during switching. Individual CMOS transistors are
similar to those discussed above, except that N-MOS and P-MOS devices
must be separated to prevent spurious conduction paths from forming
between parts of the p- and n-channel devices. Heavily doped guard rings
or channel stop diffusions typically are placed around groups of devices
of each type to achieve this separation. A loss in circuit density, relative
to N-MOS, results. However, a substantial reduction in power dissipation
is achieved, particularly for those logic systems that are operated at
fractions of the highest toggle frequency. Since CMOS represents a viable
contender for APSP logic, it is discussed in detail in section 2.6.
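
Because CMOS consumes power essentially only during switching, its
dissipation scales with toggle frequency. The sketch below illustrates this
with the standard first-order relation P = C V² f for a node that charges and
discharges once per cycle. The relation is a textbook approximation and is
not taken from this report; the capacitance, voltage and frequency values are
hypothetical, chosen only to show the scaling.

```python
# First-order CMOS switching power: P = C * V^2 * f.
# With C in pF, V in volts and f in MHz the product is in microwatts.


def cmos_dynamic_power_mw(c_load_pf: float, v_swing: float,
                          toggle_mhz: float) -> float:
    """Switching power in mW for one node toggling at toggle_mhz."""
    microwatts = c_load_pf * v_swing ** 2 * toggle_mhz
    return microwatts * 1e-3


if __name__ == "__main__":
    # Hypothetical node: 1 pF load, 5 V swing.
    for f_mhz in (10.0, 1.0, 0.1):
        p = cmos_dynamic_power_mw(1.0, 5.0, f_mhz)
        print(f"{f_mhz:5.1f} MHz -> {p:.4f} mW")
```

Under these assumptions a node run at one tenth of its highest toggle rate
dissipates one tenth of the power, which is the advantage noted above for
logic operated at fractions of the highest toggle frequency.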

CMOS on Sapphire -- Considerable developmental effort has been
aimed at producing integrated circuits on insulating substrates (either
sapphire or spinel). During the past few years reasonable quality silicon
epitaxial layers on these substrates have been produced, into which MOS
transistors can be fabricated as separate islands.

Advantages of this technology include greatly reduced capacitances
between active elements and the substrate, as well as improved packing
density made possible by the simplified isolation between devices.
Autodoping of the epitaxial layer by aluminum ions migrating from the
substrate has been a source of problems, though, as has the higher
imperfection density caused by mismatch between the silicon and substrate
crystal lattices (reference 2.4-3).

This technology has been used at Hughes in the construction of
specialized divider circuits which operate at clock rates as high as 50 MHz.
However, the low transconductance which is common to all MOS technologies
(except DMOS variants), as discussed above, limits the maximum frequency
which can be propagated between packaged devices to approximately 20 MHz.

D-MOS -- One new structure that exhibits a higher gm and a lower
output impedance than any of the MOS devices considered above is D-MOS (or
double-diffused MOS). As shown in Figure 2.4-6, an extremely short channel
is obtained by first diffusing boron into the source region and then diffusing
phosphorus through the same oxide opening. In this process diffusion times
are carefully controlled so that the boron diffuses laterally about 0.04 mil
(1 µm) further than the phosphorus, forming a short p region extending toward
the drain. The process is controlled so that the impurity concentration in this
region will be high enough to make the device operate in an enhancement mode.

The remainder of the distance between the source and the drain
consists of very high resistivity p type material which is inverted by charges
trapped in the oxide, thereby providing a "drift" region that serves to
increase the source to drain breakdown voltage and to greatly reduce the
Miller capacitance. However, the drift region is also affected by gate
voltage, so that the behavior of the composite device must be modeled as a
depletion mode transistor in series with the enhancement mode inverter
transistor (reference 2.4-4).

Figure 2.4-6. D-MOS cross section.

When used in logic circuits, the D-MOS device can be treated simply
as a very good N-channel MOSFET. Thus D-MOS devices may be connected
in series to form NAND gates, and/or in parallel to form NOR gates, and
any of the load devices described above may be used in logic circuits. Some
form of channel stop diffusion is required, however, in D-MOS circuits
because of their very light substrate doping.

D-MOS outperforms all of the standard MOS technologies considered
above. Using this technology, 2.9 ns gates have been fabricated and an
experimental 11-stage ALU which exhibited 32 ns total delay has been
realized (reference 2.4-5). Since D-MOS represents a potential technology
for the APSP requirements, it is discussed in greater detail in section 2.7.

V-MOS -- Another new short channel MOS structure is V-MOS,
shown in Figure 2.4-7. In effect, the device is a vertical (more precisely,
an inclined) D-MOS transistor with the substrate as its source. Despite its
complex appearance, its fabrication involves no difficult processing steps
(reference 2.4-6), with the possible exception of the nonstandard anisotropic
etch required to form the four-sided pyramidal gate.

Since the channel extends completely around the gate opening and is
only 1 micron long, extremely efficient use is made of the gate area. In
fact, it has been reported that a V-MOS device was made with approximately
one-fifth the lateral gate area, one-third the active area, one-half the gate
capacitance, and twice the transconductance of a silicon gate N-MOS device
produced to the same tolerances in the same laboratory.

V-MOS devices have been used in an experimental counter circuit
that toggles at 33 MHz and in an inverter chain that was used to drive two
TTL loads at 60 MHz. V-MOS devices are likely to find wide application in
future memories but are inefficient in their implementation of random logic
circuits. This is true because all V-MOS source nodes must be connected
to the substrate, thereby limiting these devices to use in implementing only
NOR gates.
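
As a rough reading of the V-MOS comparison quoted above, the ratio of
transconductance to gate capacitance is commonly used as a first-order speed
figure of merit for a MOS device. The sketch below applies that assumption to
the reported ratios. The report itself does not state this figure of merit,
and the function name is hypothetical.

```python
# Relative gm/C figure of merit for V-MOS versus silicon gate N-MOS,
# using the ratios reported in the text.


def relative_gm_over_c(gm_ratio: float, cap_ratio: float) -> float:
    """Return (gm/C) of the new device divided by (gm/C) of the reference."""
    return gm_ratio / cap_ratio


if __name__ == "__main__":
    # Twice the transconductance, one-half the gate capacitance.
    print(relative_gm_over_c(gm_ratio=2.0, cap_ratio=0.5))
```

Under this assumption the reported V-MOS device would have roughly four times
the gm/C of the comparable silicon gate N-MOS device.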

2.4.3  CCD  Digital  Technology 

Charge  Coupled  Devices  (CCDs)  have  enjoyed  a tremendous  growth 
since  their  introduction  in  1970.  This  section  will  briefly  introduce  the 
operation  of  surface  and  buried  channel  CCDs  and  then  discuss  the  status  and 
future  of  CCDs  as  logic  devices  in  LSI. 

The operation of CCDs is controlled by electrodes that cover the
surface of a glass insulating layer on the silicon substrate, as illustrated in
Figure 2.4-8, which shows a p type substrate. The positive electrode voltage
repels majority carriers (holes) in the silicon, creating a depletion region
of negatively charged acceptor sites as shown in Figure 2.4-9. This region
extends to a depth in the substrate that increases with magnitude of the gate
voltage. When minority carriers (electrons) are present, they tend to collect
at the silicon-glass interface and thus decrease the surface potential, which
also reduces the extent of the depletion region. The empty cell condition is,
in fact, a non-equilibrium condition, since minority carriers are
spontaneously generated by thermal effects in the bulk of the silicon; this
is usually called "dark current". When the register is operating, these
minority carriers are swept along and the cell is cleared at every transfer.
The potential of the surface of the p-type silicon is at a minimum beneath
the center of the bias electrode. For a p-type surface channel device the
potential varies as shown in Figure 2.4-10. The electrons are stored at the
potential minimum.

Figure 2.4-8. Typical cross section of CCD.

Figure 2.4-9. Charge distribution in a surface channel CCD.

Figure 2.4-10. Potential distribution in a surface channel CCD.

In a buried channel structure, a layer of n doped material is introduced
on the surface of the p silicon by epitaxial or ion implantation techniques. A
typical BCCD structure is shown in Figure 2.4-11.

By this method, the potential minimum under the centerline of the
electrode is moved away from the glass-silicon surface into the n type donor
layer. The typical potential distribution and the location of the stored charge
near the potential minimum are shown in Figure 2.4-12. In the absence of
stored charge, the donor layer is occupied by its minority carriers; when
there is a signal packet of electrons, donor minority carriers are present
in layers extending out to the glass interface and to the p-n junction. Thus,
the potential minimum remains in the n layer and charges are transferred
inside this layer.

Figure 2.4-12. Energy level beneath the centerline of an electrode in a
buried channel CCD.

When a positive voltage is applied to the electrode, the mobile
majority electrons in the surface layer are attracted, leaving bare positively
charged donor sites. This positive region acts to create a depletion region
in the p type silicon substrate which is quite similar to the distribution within
a surface channel CCD substrate. The electric field gradient goes to zero
along the centerline under the gate electrode at a point just inside the doped
layer; this locates a potential minimum inside the layer, along which injected
signal electrons travel. Since there are usually many more charge trapping
states on the surface of silicon semiconductors than inside the bulk material,
charge is transferred more rapidly and more completely in buried channel
devices. Transfer inefficiency values (fraction of charge not transferred)
of 10 to 10 are readily attained.
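
The practical effect of transfer inefficiency can be illustrated with a short
calculation. If a fraction e of the charge is left behind at each transfer,
the fraction of a packet remaining after N transfers is (1 - e)^N. The sketch
below is a generic illustration of that relation; the inefficiency value and
transfer counts used are assumed for the example and are not figures from
this report.

```python
import math


def retained_fraction(inefficiency: float, transfers: int) -> float:
    """Fraction of the original charge packet left after N transfers."""
    return (1.0 - inefficiency) ** transfers


def max_transfers(inefficiency: float, min_retained: float) -> int:
    """Largest transfer count that still retains at least min_retained."""
    return math.floor(math.log(min_retained) / math.log(1.0 - inefficiency))


if __name__ == "__main__":
    eps = 1e-4  # assumed transfer inefficiency, for illustration only
    for n in (100, 1000, 10000):
        print(f"{n:6d} transfers: {retained_fraction(eps, n):.4f} retained")
    print("transfers for 90% retention:", max_transfers(eps, 0.90))
```

The calculation shows why low inefficiency matters for long registers: the
loss compounds with every transfer, so the usable register length is set
jointly by the inefficiency and the signal degradation that can be tolerated.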

In the surface channel device, the depth of the depletion region of
an empty well is a measure of the total well capacity for charge transfer.
As indicated in Figure 2.4-9, the depletion region shrinks in depth as more
charge is accumulated under the control electrode. When the depletion
region approaches the channel surface, the maximum bucket capacity has
been reached. In buried channel devices, a similar maximum capacity
restriction occurs when the charge in the buried channel becomes so large
that the potential minimum that defines the well spreads out to the surfaces
of the buried channel.

The charge packet is moved along the structure of the CCD by an
electric field which is swept along the line of control electrodes by
application of appropriately phased voltages. It should be noted that the
difference in level of the electrodes provides the step in surface potential
needed to force the charge to transfer in the desired direction. Two-level
gate electrodes must be used in one- and two-phase clock sequences, and as a
result, overlapping of the electrodes is required to reduce the effects of
stray fields. With three- or four-phase clocks, the step in surface potential
can be obtained even with single layer electrodes. Another technique of
transfer gating is to modify the surface doping to create a potential step
under the electrode, which yields a result similar to that of a two-level
gate electrode.
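
The phased clocking described above can be pictured with a small behavioral
model. The sketch below is a simplified illustration of a three-phase shift
register only: it tracks which electrode holds each charge packet and ignores
potentials, well capacity and transfer losses. The layout (three electrodes
per bit, packets starting under the first phase) and the function name are
assumptions made for the example.

```python
PHASES = 3  # three electrodes per bit, one driven by each clock phase


def run_three_phase(bits, cycles):
    """Shift packets toward the output; return the bit emerging each cycle."""
    wells = [0] * (PHASES * len(bits))
    for cell, bit in enumerate(bits):
        wells[cell * PHASES] = bit  # packets start under phase-1 electrodes

    out_bits = []
    for _ in range(cycles):
        emerged = 0
        for _phase in range(PHASES):
            # The next phase goes high and the current one goes low, so
            # every packet follows the potential well one electrode onward.
            emerged += wells[-1]          # charge under the last electrode exits
            wells = [0] + wells[:-1]
        out_bits.append(emerged)
    return out_bits


if __name__ == "__main__":
    # The bit nearest the output (last in the list) emerges first.
    print(run_three_phase([1, 1, 0], cycles=3))
```

In this model each full clock cycle of three phase steps advances every packet
by one bit position, which is the directional transfer that the phased
voltages are intended to produce.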

The  input,  output,  and  control  transfer  gates  are  fabricated  of 
aluminum  or  doped  high  conductivity  polysilicon.  One  advantage  of  overlapping 
metal  gates  is  that  they  are  inherently  shielded  from  external  fields  and  have 
low  resistivity.  However,  polysilicon  is  easier  to  process  and  is  mechanically 
somewhat  more  compatible  with  the  glass  insulation  layers  which  are  used. 

Gate  structures  of  metal  overlapping  polysilicon  are  frequently  employed. 

The  advantages  of  the  two-level  overlapping  structure  are  its  shielding  and 
relative  simplicity.  The  single  layer  structure  is  susceptible  to  stray  fields 
and  requires  careful  control  of  the  gap  spacing. 

The rapid advance in the speed of CCD operation offers a future
possibility of a high level of circuit integration. The maximum frequency of
operation of a CCD depends on its detailed design and the transfer efficiency
required for a particular application. Furthermore, since the maximum
clock frequency depends on the dimensions of the CCD in the direction of
charge transfer, the capabilities of the photolithographic process used to
fabricate the CCD usually limit high frequency performance. Conventional
photolithographic techniques limit the minimum dimensions of a device to
5 to 7.5 µm. A high resolution projection photolithographic process developed
at Hughes allows devices with minimum dimensions of 1.5 µm (10 to 15 µm
per bit) to be fabricated. Estimated high frequency limits of various CCD
structures for conventional and high resolution photolithography are indicated
in Table 2.4-1.

To make effective use of the high speed capability of the basic CCD
transfer mechanisms, fast support circuits, probably including bipolar or
DMOS devices on the CCD chip, will have to be incorporated. The process
technology for single chip combinations of bipolar and CCD devices has been
demonstrated with laboratory and experimental devices by Hughes. (See
sections 4.5 and 4.6 of this report.) In the future this capability and the
high resolution projection photolithography process (which was developed
for the fabrication of high density CCD memories) will provide operation
above 300 MHz. Reaching this level of operation will require establishing
the processes and solving other problems, including the technique of applying
small, effective on-chip interconnects between the bipolar circuits and the
CCDs.

TABLE 2.4-1. MAXIMUM CCD CLOCK FREQUENCY

                                     Conventional         High Resolution
                                     Photolithography,    Photolithography,
CCD Structure                        fc maximum, MHz      fc maximum, MHz

Surface channel                      10 to 20             40 to 60
Shallow buried channel               20 to 30             60 to 80
Peristaltic (deep buried channel)    150 to 200           300 to 400

A typical block diagram of a high speed digital CCD chip is shown
in Figure 2.4-13. To achieve high speed performance, the cell size must
be small; hence control gates and the interconnection lines must also be
small, and the devices must be packed densely on the chip. The main bus
for distribution of clock signals probably will have to be fabricated of
aluminum because of the high current density. The control electrodes will
use local branch connections fabricated with low resistance polysilicon;
because of the small geometry and close spacing, the resistance of the
polysilicon leads may be a design factor.

Figure 2.4-13. Bipolar/CCD shift register.

Although CCDs have been actively developed for imaging, analog
signal processing and digital memories, relatively little effort has been
devoted to CCD digital logic. The Hughes CRC 100 chip to be tested in
early 1976 includes two unique digital adder circuits. The very low power,
high density and relatively high frequency possible with CCDs suggest that
useful CCD logic structures may be possible. The logic functions are
accomplished by transfer of charge under control electrodes which are
overlapped with appropriate transfer or barrier gates. The fundamental
operations of binary arithmetic for single digit numbers are accomplished by
addition of charge from two sources into a single well and by sensing the
overflow when two digits are simultaneously present behind the barrier
gate. Floating electrodes or diffusions can be used to non-destructively
sense the resulting charge and control other logic barriers to provide
complex functions.

Low power CCD memories have been steadily advancing (see
section 2.3) in speed, power and size. However, the complexities of CCD
logic in terms of clock generation, bias and control voltage inputs,
interconnects, and regeneration and output circuitry indicate that competition
for the ultimate low power-delay product LSI digital system will be a
difficult, uphill battle for CCDs. The competitors (I²L, CMOS, DMOS) are
continuously reducing their size (capacitance) and output voltage swing. Note
that one generally assumes that signal-to-noise ratio remains approximately
constant as output voltage swing is reduced, since digital noise is primarily
due to other logic devices of the same type in the system. However, swings
below approximately one volt do begin to reduce S/N ratio and thereby
complicate the tradeoff between noise immunity and power dissipation.
Assuming the power-delay product of the 1980s is in the region of 0.1 pJ for
ring oscillator circuits of other technologies, CCDs will have to achieve (for
2.0 volt clocks) 0.0025 pF/bit or 0.0125 mil²/bit, or very efficient reactive
clock generators must be perfected. In addition, the ratio of power-delay
performance for LSI logic functions compared to ring oscillator circuits is
likely to be considerably larger for CCDs than for other technologies due to
the complication of clock line interconnects, reset requirements and output
and regeneration devices.

For these reasons it is felt that, though possible, it is not probable
that CCD logic will be in the running for the lowest power-delay, highest
packing density technology. CCDs appear to be appropriate in some
circuits (in the AVE, for example) to perform simple logic functions. Also,
CCD A/D and D/A converters and CCD serial memories do not have the
drawback of complex peripheral circuitry requirements. Hence CCD
technology has a unique place in shift-register-like applications where there
are no competing technologies on the horizon. The APSP system
configuration permits these features of CCDs to be fully exploited.

2.4.4  Other  Logic  Technologies 

Logic families other than those discussed in the previous paragraphs
have been developed. In the interest of clarity, the technologies that fall
far short of meeting the low power-delay product, high speed and high density
required for the APSP have not been included. Two interesting advanced
technologies, however, are worthy of consideration and are discussed below.

Transfer Electron Device (TED) Logic -- The fundamental speed of a
logic circuit in LSI is an essential factor in determining the cost effectiveness
of the logic circuit. Although not presently amenable to realization in other
than SSI configurations, gallium arsenide transfer electron devices (TEDs)
can be operated at very high speeds. Therefore, TEDs were evaluated in
relation to their possible future applications in low cost signal processing,
in spite of this limitation.

TEDs exploit Gunn effect electron transport, which permits very
high performance device operation. The Gunn effect results from the
fundamental physical parameters of gallium arsenide. In this regard, its
quantum mechanical energy-momentum diagram exhibits satellite valleys,
in addition to the central valley, for electron transport along a vector
direction determined by the 1-0-0 edges of the first Brillouin zone. The "bottom"
of the first satellite valley is located at a higher energy and momentum state
than the "bottom" of the central valley. However, electron mobility is much
lower (by approximately a factor of two) at the satellite valley. Thus, as
the electric field is increased along the appropriate axis of the crystal, the
electron velocity increases and then decreases abruptly as bunching of
electrons occurs. (The formation of a high-field-concentration domain results.
The remainder of the crystal remains in the state of higher mobility.) The
increased voltage drop across this domain reduces the potential drop along
the remainder of the conduction path and thus a corresponding overall
reduction of the current takes place. The domain travels from the cathode toward
the anode where it is absorbed. Then the crystal returns to its initial high
current condition, and if the electric field remains constant, a new domain
will be formed.

The build-up and annihilation of the domain occurs very rapidly (in
a few picoseconds). The transit velocity of the domain is about 10^5 m/sec.
Hence, for example, a 10 micron transit path will correspond to production
of pulses of about 100 picosecond duration.
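
The pulse duration quoted above is simply the transit path divided by the
domain velocity. The following minimal sketch works the arithmetic. It
assumes a domain velocity of 10^5 m/sec, which is the value that reproduces
the quoted 100 picosecond figure for a 10 micron path; the function and
variable names are illustrative only.

```python
def domain_transit_time(path_length_m: float, domain_velocity_m_per_s: float) -> float:
    """Return the time a domain needs to cross the transit path, in seconds."""
    return path_length_m / domain_velocity_m_per_s


# 10 micron path from the text; 1e5 m/sec is an assumed reading of the
# domain velocity (it is the value consistent with the 100 ps result).
PATH_LENGTH_M = 10e-6
DOMAIN_VELOCITY_M_PER_S = 1e5

transit_time_ps = domain_transit_time(PATH_LENGTH_M, DOMAIN_VELOCITY_M_PER_S) * 1e12
print(f"transit time = {transit_time_ps:.0f} ps")
```

Halving the transit path halves the pulse duration, which is why short
gate-to-anode distances are attractive for high speed operation.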

The circuit technique by which the Gunn effect is utilized for logic
operation is one of extracting an output voltage pulse in a series resistance
(either an external element, or an integral part of the device). In logic
circuits, the diode is operated below the threshold field required for domain
formation, and a triggering field is applied by a gate electrode placed across
the path of current flow. A typical device is shown in Figure 2.4-14.

All important basic logic functions can be realized with a single
planar Gunn device similar to the structure shown. The logical AND
operation can be accomplished in a structure incorporating a pair of control gates
for which critical field is achieved only with both gates activated. The
logical OR may be realized by a pair of gates either of which can generate
a critical field. Structures for exclusive-OR (and hence inversion),
comparator operations, logical-carry generation, and other ingenious
arrangements have been devised. These functions are achieved in structures
incorporating multiple cathodes, or in structures utilizing lateral domain
spreading (references 2.4-7, 8).
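
The AND and OR arrangements described above are both threshold decisions:
a domain nucleates only when the bias field plus the contributions of the
activated gates reaches the critical field. The sketch below is a purely
behavioral illustration of that idea. The field values are hypothetical
numbers in arbitrary integer units chosen for the example, not device data.

```python
def domain_nucleates(bias_field, gate_fields, gate_inputs, critical_field):
    """True when the bias plus the fields of all activated gates reach the critical field."""
    total = bias_field + sum(f for f, on in zip(gate_fields, gate_inputs) if on)
    return total >= critical_field


# Hypothetical fields in arbitrary units; the diode is biased below threshold.
CRITICAL_FIELD = 100
BIAS_FIELD = 70


def ted_and(a, b):
    # Each gate adds 20 units: one gate alone gives 90, both give 110.
    return domain_nucleates(BIAS_FIELD, (20, 20), (a, b), CRITICAL_FIELD)


def ted_or(a, b):
    # Each gate adds 40 units: either gate alone gives 110.
    return domain_nucleates(BIAS_FIELD, (40, 40), (a, b), CRITICAL_FIELD)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", int(ted_and(a, b)), "OR:", int(ted_or(a, b)))
```

The same device geometry therefore yields different logic functions
depending only on how much field each gate contributes relative to the
margin between the bias and the critical field.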
Figure 2.4-14. Gate-controlled Gunn diode.


To actually achieve the logical operations suggested above in working
TEDs, several device characteristics must be considered. These include:

1. With a threshold device, a certain minimum quantity of
energy-momentum is required for switching.


2. The fundamental TED switching time is short because the electron
transfer mechanism is fast (on the order of picoseconds). As
a result, the device switching time is determined primarily by
parasitic elements in the external circuitry.

3.  The  output  pulse  shape  is  controlled  only  by  the  parameters  of 
the  device  and  input  pulse  shape  is  not  necessarily  related  to  the 
output  pulse  shape. 

4.  Any  additional  inputs,  after  the  device  has  switched,  cannot 
change  the  output  pulse,  e.g.,  a second  input  pulse  cannot  initiate 
a second  dipole  domain  until  the  first  has  been  extinguished. 

5. The pulse propagation velocity is constant since the pulse
propagates at the dipole domain velocity.
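
Item 4 above implies a dead time equal to the domain transit time: inputs
that arrive while a domain is in transit produce no output. The following
sketch illustrates this behavior for a list of input pulse times. The
function name and the sample times are hypothetical, and the model assumes
that a new domain can be launched at the instant the previous one is
extinguished.

```python
def output_pulse_times(input_times_ps, transit_time_ps):
    """Return the input times that launch a domain (and hence an output pulse)."""
    launched = []
    busy_until = float("-inf")  # time at which the current domain is extinguished
    for t in sorted(input_times_ps):
        if t >= busy_until:
            launched.append(t)
            busy_until = t + transit_time_ps
    return launched


# Hypothetical input pulse train; a 100 ps transit time is assumed.
inputs_ps = [0, 40, 100, 150, 260]
print(output_pulse_times(inputs_ps, 100))
```

In this example the inputs at 40 ps and 150 ps are ignored because a domain
launched earlier is still in transit.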

In addition to the above considerations, there are fundamental constraints on
device material parameters and form factors. As reported in the literature,
the doping density, n, the device length, l, and the device thickness, d, have
lower limits for stable domain formation given by:

$$n\,l > 10^{13}/\mathrm{cm}^2$$

$$n\,d > 10^{12}/\mathrm{cm}^2$$
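
The two limits above can be applied as a quick screening test on a
candidate geometry. The sketch below reads the limits as n·l > 10^13/cm²
and n·d > 10^12/cm²; the sample doping levels and dimensions are
hypothetical values chosen only to show one passing and one failing case.

```python
NL_MIN_PER_CM2 = 1e13  # lower limit on doping density x device length
ND_MIN_PER_CM2 = 1e12  # lower limit on doping density x device thickness


def supports_stable_domain(n_per_cm3, length_cm, thickness_cm):
    """Check both doping-dimension products against their lower limits."""
    return (n_per_cm3 * length_cm > NL_MIN_PER_CM2
            and n_per_cm3 * thickness_cm > ND_MIN_PER_CM2)


MICRON_CM = 1e-4  # one micron expressed in centimeters

# Hypothetical candidates: 10 micron length, 1 micron thickness.
print(supports_stable_domain(5e16, 10 * MICRON_CM, 1 * MICRON_CM))  # both products exceed the limits
print(supports_stable_domain(1e15, 10 * MICRON_CM, 1 * MICRON_CM))  # both products fall short
```

Because both limits scale with n, shrinking the device dimensions forces a
proportionally higher doping density.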


Thus, fundamental physical properties of the material are design
constraints. The most important of these result in limitations on logic
speed and reliability. Maximum logic speed (the number of pulses per second
at which the device can be utilized) is, as noted, primarily determined by
the pulse propagation time (i.e., the time during which no additional input
pulse will cause an output pulse). This pulse propagation time is determined
by the domain propagation velocity and the length of the device. Reliability is
primarily associated with device operating temperature, which places a
restriction on the power dissipated by the device. Since the highest power
consumption is at threshold, the threshold power must be minimized
sufficiently to give reliable operation. This minimum must be determined
somewhat subjectively until reliability testing can establish the appropriate
failure mechanisms and predict time-temperature relations.


If Gunn effect devices are to be competitive with other logic elements,
they should be usable in LSI arrays. Moreover, for high speed operation, the
resistance of the circuit should be low and the transit time of the domain
from the gate electrode to the anode should be as short as possible. These
factors, plus the fact that power consumption decreases while the domain is
in transit, indicate that the threshold power consumption should be
minimized. A recent analysis of these problems * suggests that a figure of
merit for an optimum device can be calculated by using the standard
constraints on doping density, dimensions of the gate, and the thickness of the
device. This figure of merit is the product of threshold power ($P_{th}$) x
device resistance ($R_o$) x transit time ($t_{tr}$) and may be evaluated as

$$P_{th} R_o t_{tr} = 1.1 \times 10^{39}\,(V^2/n^3)\ \mathrm{sec/cm^9}$$

where

V is the critical threshold voltage
n is the doping density
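
The figure of merit above can be evaluated numerically. The sketch below
reads the expression as 1.1 x 10^39 (V²/n³) sec/cm^9, the cube on n being
the exponent that makes the units balance to volt²-seconds. The threshold
voltage and doping values used are hypothetical and serve only to show how
the product scales.

```python
FOM_CONSTANT_SEC_PER_CM9 = 1.1e39  # constant from the expression above


def ted_figure_of_merit(v_threshold_volts, n_per_cm3):
    """Return P_th * R_o * t_tr in volt^2-seconds for the given V and n."""
    return FOM_CONSTANT_SEC_PER_CM9 * v_threshold_volts ** 2 / n_per_cm3 ** 3


# Hypothetical operating point, for illustration only.
V_THRESHOLD = 3.0   # volts (assumed)
DOPING = 1e16       # per cm^3 (assumed)

base = ted_figure_of_merit(V_THRESHOLD, DOPING)
doubled_doping = ted_figure_of_merit(V_THRESHOLD, 2 * DOPING)
print(f"figure of merit = {base:.2e} V^2 sec")
print(f"improvement from doubling n = {base / doubled_doping:.1f}x")
```

The cubic dependence on n explains why the doping density is pushed as
high as impact ionization permits, while the quadratic dependence on V
explains the interest in a low critical voltage.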

If the doping density is made too large, impact ionization may occur
within the nucleated domain; therefore, n cannot be increased indefinitely.
The critical voltage cannot be decreased arbitrarily. Hence, an optimum
set of design parameters can be determined for dimensions and doping
densities representing reasonable values according to current knowledge.
For a diode notch width of 10 µm, a doping level of 10^1[?]/cm^3, and a gate
electrode length of 1 µm, it was found that the theoretical value of minimum
threshold power would be 12 mW. (Experimental devices have exhibited
measured dissipations of 25 mW.*)

Gunn effect logic offers extremely high speed of operation with an
apparent capability of providing all the necessary logic functions. One
disadvantage is the high threshold power required per gate, which is larger
than required for other existing or projected forms of logic and is, in fact,
so large that it will create significant thermal problems if TEDs are used
in LSI high density arrays. A secondary problem is that if large scale
integration is not practical for TEDs, then significant propagation delay will
be introduced in logic circuits in inter-chip connection networks. Thus, the
inherent high speed advantages of the Gunn effect will be lost. In view of
these difficulties and the lag in gallium arsenide technology development
vis-a-vis the current state of silicon technology, it is unlikely that TED LSI
will become available in the foreseeable future.

* There are experimental indications that dimensions of the devices
cannot be reduced much below the values used in the calculation
above, because of a "dead zone" effect near the cathode in which
relaxation effects alter the behavior of the device.

Josephson Junction Logic -- Josephson junction devices operate at
superconducting temperatures (e.g., 4.2°K). Like Gunn effect logic devices
they are capable of gigahertz frequency operation. In addition, however,
Josephson devices have the apparent advantage of very low power dissipation.
Power-delay products of the order of 10^-17 joules and propagation delays in
the 10-50 ps range have been exhibited in laboratory test circuits. The
basic switching circuit is fabricated as overlapping thin lead films separated
by pinhole-free hyper-thin (e.g., 30 Å) oxide tunnel barriers. The junction
is mounted on a superconducting metal ground plane and is controlled by an
overlaid strip of superconducting metal which carries a control current.
Because of their high performance, these devices are being evaluated as
alternatives to semiconductor logic circuits.
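
The quoted power-delay product and delay range imply a per-gate power,
since power-delay product is power multiplied by delay. The sketch below
works that arithmetic and compares the result with the 12 mW theoretical
TED threshold power cited earlier. It assumes that the 10^-17 joule figure
applies across the whole 10-50 ps range, which the text does not state
explicitly.

```python
def gate_power_watts(power_delay_joules, delay_seconds):
    """Power per gate implied by a power-delay product at a given delay."""
    return power_delay_joules / delay_seconds


JOSEPHSON_POWER_DELAY_J = 1e-17   # order of magnitude quoted in the text
TED_THRESHOLD_POWER_W = 12e-3     # theoretical minimum quoted for the TED

power_at_10ps = gate_power_watts(JOSEPHSON_POWER_DELAY_J, 10e-12)
power_at_50ps = gate_power_watts(JOSEPHSON_POWER_DELAY_J, 50e-12)

print(f"Josephson gate power at 10 ps: {power_at_10ps * 1e6:.2f} microwatts")
print(f"Josephson gate power at 50 ps: {power_at_50ps * 1e6:.2f} microwatts")
print(f"TED threshold power is about {TED_THRESHOLD_POWER_W / power_at_10ps:.0f} times larger")
```

Under these assumptions the Josephson gate dissipates on the order of a
microwatt or less, roughly four orders of magnitude below the TED
threshold power.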

Although the thermal effect of power dissipation is small in
superconducting devices, Josephson junction logic circuits share problems common
to all ultra-high speed logic including:

1. Signal delay and lead inductance in inter-chip connections
increase total effective delay per gating stage.

2.  Bias  voltage  must  be  provided  through  a very  low  series 
resistance  and  inductance. 

Over and above these common difficulties, the superconducting
Josephson devices have additional requirements:

1. New techniques must be developed for production fabrication of
30 Å thick oxide layers over large areas.

2.  New  compatible  superconducting  and  insulating  materials  and 
manufacturing  processes  must  be  developed. 

3.  New  design  rules  based  on  superconduction  processes  rather 
than  semiconductor  physics  must  be  devised. 
4. New LSI testing techniques must be developed since Josephson
devices operate only in a liquid helium environment.

On the assumption that all the above difficulties can be overcome,
it would appear that Josephson junction logic may have application at some
time in the future in large ground based computing systems. Cryogenic
cooling is impractical for large spaceborne signal processing systems and,
therefore, it is not likely that Josephson logic will be applied as long as
other alternative technologies are available and satisfy the APSP
requirements.

2.4.5 Computing Power Concepts

In addition to power-delay product, two characteristics are
important inputs for digital LSI technology tradeoff studies: gate density
and computing power. Gate density obviously affects yield, interconnect
capacitance and single chip functional complexity (thus overall density of
required output buffering). Computing power relates to what can be done
with the gates or cells in terms of logic efficiency. A key characteristic
of ECL logic which is valuable in terms of computing power is the inherent
complement output availability in the individual cell. The following
discussion briefly outlines an ECL Universal Logic Gate (ULG) concept
which should be applicable to other forms of logic. The concept was
developed to minimize stages in very high speed radar signal processing.
The important message is that through optimization of logic, improvements
in computing power per gate or per mm² or per pjoule can be made. Such
studies will be important in the architecture definitions and optimizations
for the APSP.

The ULG comprises one-stage arrays of two identical cascode
circuits. These ULGs realize all logic functions of four (and fewer) input
variables in approximately the same propagation delay as a single ECL
current switch emitter follower (CSEF) gate fabricated with the same
processing technology. Substantial power and power-delay product
advantages relative to CSEF arrays have been demonstrated using comparable
silicon area for realization of all four-input functions. The ULG was
developed for implementing logic arrays with a minimum number of
gating stages. ULGs permit realization of logic arrays with considerably
improved performance, which is achieved because logic functions can be
factored or decomposed very efficiently using the ULG.

Reduction of the propagation delay through combinational (gating)
arrays usually has been achieved by reducing the delay of the individual
gates in these arrays. Frequently, gate power dissipation is increased or
transistor performance is improved in the gates to achieve this reduction.
In conjunction with this approach, more logically efficient gate circuits
can be built and used in arrays. The primary objective of the work
referenced herein was the development of new circuits for implementing
logic arrays with a minimum number of serial gating stages.
Secondary goals were reduction of array power dissipation and silicon area
through the use of fewer gate-building blocks. Since the LSI circuits were
intended for use in very high speed signal processing, only bipolar
technologies were considered.

A universal logic gate (ULG) is a combinational circuit that can be
"programmed" to realize any specified function of its input variables. A
one-stage ULG realizes any specified function in approximately the same
propagation delay as a conventional gate built with the same technology. The
single-stage ULG was defined based on a study of selected switching
literature. This research revealed that a ULG having the largest fan-in
practical (i.e., three or four) and the same one-stage propagation delay
as available (e.g., ECL) gates could be used to realize logic arrays with a
minimum number of stages. In a ULG implementation of a specified logic
function, further reduction in the number of gating stages in worst case
array input/output paths can be achieved only by increasing gate (ULG)
fan-in. ULG fan-in of four was selected as a compromise between circuit
complexity of the ULG and potential stage-delay reduction in arrays.
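
The defining property of a ULG, that it can be "programmed" to realize any
function of its inputs, can be illustrated behaviorally as a truth table
lookup. The sketch below is a functional model only and says nothing about
the cascode circuit realization; the class and method names are
hypothetical.

```python
from itertools import product


class UniversalLogicGate:
    """Behavioral model of a one-stage ULG: a programmable truth table."""

    def __init__(self, truth_table, fan_in=4):
        if len(truth_table) != 2 ** fan_in:
            raise ValueError("truth table must have 2**fan_in entries")
        self.fan_in = fan_in
        self.truth_table = tuple(1 if bit else 0 for bit in truth_table)

    @classmethod
    def from_function(cls, func, fan_in=4):
        """Program the gate by tabulating func over all input patterns."""
        table = [func(*bits) for bits in product((0, 1), repeat=fan_in)]
        return cls(table, fan_in)

    def __call__(self, *inputs):
        if len(inputs) != self.fan_in:
            raise ValueError("wrong number of inputs")
        index = 0
        for bit in inputs:  # first input is the most significant index bit
            index = (index << 1) | (1 if bit else 0)
        return self.truth_table[index]


# Two different four-input functions realized by the same one-stage block.
parity = UniversalLogicGate.from_function(lambda a, b, c, d: a ^ b ^ c ^ d)
and_or = UniversalLogicGate.from_function(lambda a, b, c, d: (a & b) | (c & d))

print(parity(1, 0, 1, 1), and_or(1, 1, 0, 0))
print("distinct four-input functions:", 2 ** (2 ** 4))
```

A fan-in of four gives 65,536 programmable functions per block, which
indicates why larger fan-in quickly increases circuit complexity.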

Minimum gating stage logic synthesis with ULGs may be illustrated
in the design of a 3 x 3-bit binary multiplier. A "conventional" logic design
for this circuit, suggested in Figure 2.4-15a, requires four gating stages
in forming the most significant bits of the product.

A 2:1 reduction of the delay through this circuit and a correspondingly
large reduction in the number of gating blocks are possible. These
reductions may be achieved by factoring the logic functions for the individual
multiplier outputs more efficiently. A minimum delay logic partition
obtained for the two most significant multiplier outputs is shown in
Figure 2.4-15b. (The Karnaugh map in each block designates the logical
requirements of the block.) Actual realization of a minimum delay
partition in general requires one-stage logic blocks capable of realizing any
arbitrary logic function of four or fewer inputs, i.e., a one-stage ULG.
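
The two-stage result quoted above can be checked against a simple lower
bound: a block with fan-in four can combine at most four signals per stage,
so an output that depends on more than four inputs needs at least two
stages. The sketch below tabulates the two most significant product bits of
a 3 x 3-bit multiplier and counts the inputs on which each depends. It
establishes only the bound, not the partition of Figure 2.4-15b, and the
function names are illustrative.

```python
from itertools import product


def multiplier_output(bit):
    """Return a function of (a2, a1, a0, b2, b1, b0) giving one bit of the 3 x 3 product."""
    def output(a2, a1, a0, b2, b1, b0):
        a = 4 * a2 + 2 * a1 + a0
        b = 4 * b2 + 2 * b1 + b0
        return ((a * b) >> bit) & 1
    return output


def support_size(func, n_inputs):
    """Count the inputs that can change the output for at least one input pattern."""
    count = 0
    for i in range(n_inputs):
        for pattern in product((0, 1), repeat=n_inputs):
            flipped = list(pattern)
            flipped[i] ^= 1
            if func(*pattern) != func(*flipped):
                count += 1
                break
    return count


def min_stages(n_variables, fan_in):
    """Smallest number of stages s such that fan_in**s >= n_variables."""
    stages, reach = 0, 1
    while reach < n_variables:
        reach *= fan_in
        stages += 1
    return stages


for bit in (5, 4):  # the two most significant product bits
    n = support_size(multiplier_output(bit), 6)
    print(f"product bit {bit}: depends on {n} inputs, "
          f"at least {min_stages(n, 4)} stages with fan-in 4")
```

Both outputs depend on all six multiplier inputs, so two stages is the
minimum attainable with four-input blocks, in agreement with the 2:1
reduction from the four-stage conventional design.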

In addition to illustrating the utility of a one-stage ULG in reducing
network delay, the above multiplier design also suggests the advantages of
modular ULG construction. In this regard, it is noted that two- and
three-input gates would be used along with four-input ULGs if the logic partition
of Figure 2.4-15b were implemented directly.
2.5 I2L TECHNOLOGY

This section will first discuss the development of I2L configurations
and the significance of this new logic to bipolar LSI. Later the device
limitations inherent in the basic configuration and variations of the circuit
which improve performance will be discussed. An attempt will be made to
present the advantages and disadvantages of the various I2L permutations
in tabular form. Finally, the future of I2L will be discussed based on the
theoretical limit and projected state-of-the-art.

2.5.1 I2L Development

Integrated Injection Logic (I2L) or Merged Transistor Logic (MTL)
was developed in an effort to resolve the problems associated with
conventional bipolar logic. The I2L structure can be derived from direct-coupled
transistor logic (DCTL). Figure 2.5-1 shows the evolution from DCTL to
I2L. Figure 1(a) shows three DCTL elements connected together; 1(b) shows
the re-allocation of the current supplying resistors. Re-drawing the dotted
portion of 1(b) and replacing the two output transistors with a multi-collector
transistor results in 1(c). Because of the large area required for the 1(c)
resistors, the base resistor is replaced with a PNP current source resulting
in 1(d). Connecting the PNP base to the NPN emitter results in the
integrated injection/merged transistor concept 1(e). The entire I2L gate can be
merged into one fabrication region. Figure 2.5-2 shows the resulting I2L
structure.

The I2L gate consists of a lateral P-N-P transistor as a current
source and a vertical multicollector N-P-N transistor as an inverter. The
term integrated injection derives from the fact that the P-N-P transistor is
considered to inject current into the N-P-N transistor and is in fact part of
the N-P-N structure and may be common to other N-P-N transistors
forming an injector strip as shown in Figure 2.5-3.
(a) DIRECT COUPLED TRANSISTOR LOGIC
(b) RELOCATION OF RESISTORS
(c) EQUIVALENT SINGLE GATE WITH MULTIPLE COLLECTORS
(d) RESISTOR REPLACED WITH CURRENT SOURCE
(e) FINAL I2L CONFIGURATION WITH CURRENT SOURCE MERGED WITH INVERTER

Figure 2.5-1. Bipolar evolution from DCTL to I2L
2.5.2 I2L Performance Limitations

I2L is a new digital circuit technique, not a new circuit technology.
I2L can be manufactured using standard processes and requires only 4 masks
and 2 diffusions, P-MOS being the only device requiring fewer steps.
Because of its unique structure a 4-mil-wide I2L gate can be constructed on a
5 sq mil area. Because of the small area the parasitic capacitances are
small. Low capacitances along with a low logic voltage swing account for
the very low power-delay product of I2L.

I2L has a constant power-delay product for delays from 1 ms through
approximately 100 ns, the propagation delay time being determined
primarily by junction and parasitic capacitances. Minimum propagation delay
is limited by the N-P-N transition frequency $f_T$, which limits the basic I2L
gate to 10-25 ns delay times. The device limitations inherent to I2L are a
function of the basic structure.

The standard bipolar fabricating technology used for I2L necessitates
that the N-P-N transistor be operated in the inverse mode. The upside down
fabrication has the advantage of automatically isolated collectors and
provides common emitters, but results in poor inverse current gain $\beta_i$ and
poor transition frequency $f_T$. The poor current gain is caused mainly by
(1) hole injection in the N-type epitaxial layer, which results in low emitter
efficiency, and by (2) base resistance for the multi-collector structure.
Poor $f_T$ is caused by the retarding field in the base and the injected hole
charge in the epitaxial layer.

I2L devices have been constructed using the basic structure with
power-delay products of 0.25 - 1 pJ per gate and minimum delays of
10-25 ns with densities of 250 gates per mm² with 5 µm details.
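
The 5 sq mil gate area cited earlier in this section and the 250 gates per
mm² density can be compared by a unit conversion. The sketch below converts
the gate area to an ideal packing density. It assumes gates are packed edge
to edge with no area given to interconnect or injector rails, so the result
is an upper bound rather than an achievable figure.

```python
MM_PER_MIL = 0.0254  # one mil is one thousandth of an inch


def ideal_gates_per_mm2(gate_area_sq_mil):
    """Upper-bound packing density for a given gate area, ignoring interconnect."""
    gate_area_mm2 = gate_area_sq_mil * MM_PER_MIL ** 2
    return 1.0 / gate_area_mm2


density = ideal_gates_per_mm2(5.0)
print(f"ideal density for a 5 sq mil gate: {density:.0f} gates per mm^2")
print(f"quoted density as a fraction of ideal: {250.0 / density:.2f}")
```

The ideal figure of about 310 gates per mm² is consistent with the quoted
250 gates per mm² once some area is allowed for interconnect.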

I2L/MTL noise margins can be defined only relative to the magnitude
of the lateral injector current. Consequently, the absolute noise margin in
an I2L/MTL circuit is a function of the value of the externally adjustable
injector current. For a given injector current level, however, both
turn-on and turn-off noise margins may be defined in relation to the circuits
shown in Figure 2.5-4. As suggested in Figure 2.5-4a, turn-on noise
margin can be defined as the maximum noise current $I_{SN}$ that may be injected
at a base node without turning on a transistor (T2), which would otherwise
remain turned off. With the driving transistor (T1) itself driven by current
$I_{BO}$, the maximum collector current which can be absorbed by this
transistor (T1) is given by $\beta_u I_{BO}$. Thus, with a transistor (T1) collector
current $I_{BO}$ supplied by the lateral injector associated with the
off-transistor (T2), transistor T1 will absorb additional noise load current up
to a level given by $I_{SNO} = \beta_u I_{BO} - I_{BO}$. Any component of noise current
above this level, however, will be fed into the base of the off-transistor (T2)
where it will be amplified and will result in a noise output current $I_{CN}$.
Thus for an arbitrary $I_{SN}$ noise current level, the output noise signal
$I_{CN}$ will be given by

$$I_{CN} = \beta_u (I_{BO} + I_{SN} - \beta_u I_{BO})$$

provided that $I_{SN} \ge (\beta_u - 1) I_{BO}$.

Figure 2.5-4. Circuits defining I2L/MTL noise margin.
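
The turn-on relations above can be exercised numerically. The sketch below
takes the turn-on margin as (β_u - 1) I_BO and the output noise current as
β_u times the excess of the noise current over that margin, which is the
same expression rearranged. The gain and current values are hypothetical,
and currents are expressed in microamperes.

```python
def turn_on_noise_margin(beta_u, i_bo):
    """Largest noise current the driving transistor can absorb: (beta_u - 1) * I_BO."""
    return (beta_u - 1) * i_bo


def output_noise_current(beta_u, i_bo, i_sn):
    """Output noise current I_CN; zero while the noise stays within the margin."""
    excess = i_sn - turn_on_noise_margin(beta_u, i_bo)
    return beta_u * excess if excess > 0 else 0


# Hypothetical values: upward gain of 3, injector current in microamperes.
BETA_U = 3
for i_bo in (10, 100):
    margin = turn_on_noise_margin(BETA_U, i_bo)
    print(f"I_BO = {i_bo} uA: margin = {margin} uA, "
          f"I_CN for 25 uA of noise = {output_noise_current(BETA_U, i_bo, 25)} uA")
```

Raising the injector current from 10 to 100 microamperes raises the margin
tenfold, which illustrates why the absolute noise margin is set by the
externally adjustable injector current.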

Turn-off noise margin may be defined similarly in relation to the
circuit of Figure 2.5-4b, where $V_T = kT/q$, k is Boltzmann's constant, T is
absolute temperature, and q is the electron charge. As shown, the transistor
that is nominally turned on in the absence of noise current will begin to turn
off if $I_{SN}$ is greater than I [subscript illegible]. In this case, the
amplified signal noise current will be given by
$I_{CN} = \beta_u (I_{BO} - I_{SN})$. The critical noise voltage corresponding to
an allowed output $I_{CN}$ current value is given by

$$V_{NS\text{-}CRIT} = V_T \ln\left(\beta_u I_{BO} / I_{CN}\right)$$


The results above suggest that the I2L/MTL noise margin at a
given injector current level is probably relatively smaller than that exhibited
by conventional bipolar saturated logic families. The impedance levels
within an I2L/MTL circuit are also lower, so that I2L/MTL noise margins
are adequate to ensure proper operation in the presence of on-chip noise.
Moreover, the results above also suggest that the absolute I2L/MTL noise
margin will be increased as a function of the externally controlled injector
current level. Noise currents (defined as the difference between nominal
and actual injector current levels) will also increase when injection levels
are increased due to corresponding increased voltage drops in the distributed
injector emitter network. This limitation can be overcome by increasing the
size of some of the injectors as appropriate, in an LSI array. Such injector
size increases might also be used in any event to provide greater noise
margin in I/O interface circuits. Thus I2L might exhibit less noise immunity
than bipolar saturated logic families, but geometry corrections may offset
this disadvantage.

An interesting variation of I2L known as substrate fed logic (SFL) has
been demonstrated by the Plessey Company Limited. SFL provides
improved density and power-delay product at the expense of process
complexity. Substrate fed logic is configured as a vertical N-P-N transistor
above a vertical P-N-P injector transistor as shown in Figure 2.5-5. The
P type substrate is the P-N-P emitter and is connected to the positive supply.
The N epitaxial layer is the P-N-P base and N-P-N emitter and is grounded.
The resistance of the epitaxial layer is reduced by a mesh-like deep N+
diffusion. As a result the entire surface is available for logic
interconnection. In addition, it is proposed that the base contact could be replaced with
multiple Schottky barrier diodes, thus providing multiple input and output
devices as shown in Figure 2.5-6.
Figure 2.5-6. SFL gate circuit with
Schottky barrier input diodes.


An experimental device was fabricated without Schottky input diodes
and displayed a constant power-delay product of less than 0.05 pJ with delays
of 110 ns, with a minimum delay of ≈50 ns per gate. SFL represents a
significant increase in fabrication complexity but the improvement in density
and performance is apparent.

The use of Schottky diodes has been proposed in various other
configurations. The use of Schottky diodes improves the electrical
characteristics of I2L by reducing logic voltage swings. The power-delay product is
proportional to CV² in the capacitance limited range. Reducing ΔV increases
speed for the same power. In standard I2L

$$\Delta V = V_{BE(ON)} - V_{CE(SAT)} = 0.75 - 0 = 0.75\ \mathrm{V}$$

but for Schottky I2L

$$\Delta V = V_{BE(ON)} - \left(V_{CE(SAT)} + V_{S(ON)}\right) = 0.75 - (0 + 0.45) = 0.3\ \mathrm{V}$$
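
The effect of the reduced swing can be estimated from the two voltage
relations above. The sketch below computes both swings and, taking the
stated CV² proportionality at face value with the capacitance unchanged,
the resulting ratio of power-delay products. The constant-capacitance
assumption is a simplification introduced here; the added diodes would
contribute some capacitance of their own.

```python
V_BE_ON = 0.75   # base-emitter on voltage, volts
V_CE_SAT = 0.0   # collector-emitter saturation voltage, volts
V_S_ON = 0.45    # Schottky diode forward voltage, volts

swing_standard = V_BE_ON - V_CE_SAT
swing_schottky = V_BE_ON - (V_CE_SAT + V_S_ON)

# Power-delay product taken as proportional to C * V**2, with C held constant.
power_delay_ratio = (swing_schottky / swing_standard) ** 2

print(f"standard swing: {swing_standard:.2f} V")
print(f"Schottky swing: {swing_schottky:.2f} V")
print(f"power-delay product ratio (Schottky / standard): {power_delay_ratio:.2f}")
```

On this simplified basis the Schottky version would need about one sixth
of the switching energy of the standard gate.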


By using Schottky diodes to decouple the output, a single collector can be used,
which reduces the high inverse gain requirement and eliminates base
resistance and current distribution problems. Schottky diode transistor logic
(SDTL) can be implemented with Schottky diode decoupling at the output
(Figure 2.5-7a) or in place of the ohmic input contacts (Figure 2.5-7b).
Logically and electrically the two are the same and require the same area for
a given fan-in or fan-out.

Figure 2.5-7. SDTL gate circuits with Schottky diodes at
output (a) and input (b).

The use of another Schottky junction across the collector and base of
the inverter transistor has been proposed and called complementary constant
current logic, C3L. Different barrier heights are required since the
logic swing is the difference between the Schottky clamp's forward voltage
and the decoupling diode's forward voltage. The proposed device uses
titanium for the decoupling diodes and the standard combination of platinum
silicon for the clamp. Because of the fabrication process the PNP supply
transistor and NPN inverter are physically separate, resulting in a lower
density. Figure 2.5-8 shows the C3L gate schematically.

Figure 2.5-8. C3L gate circuit.

Schottky transistor logic (STL) using a PNM (M = metal) Schottky
transistor and Schottky input diodes has been proposed. Because of the
advanced fabrication technology required, an experimental device has not
yet been constructed, but a discrete device simulation was constructed
which showed a 9 ns reduction in gate delay at 100 µA/gate due to the Schottky
transistor. Figure 2.5-9a shows the resulting gate, and 2.5-9b the
fundamental structures.

2.5.3 I2L Status and Projections

Table 2.5-1 shows a comparison of the existing I2L structures and
present capabilities along with the advantages and disadvantages of each. It
should be noted that the best power-delay products are achieved at delays
of 100 ns or more and the highest speeds shown require up to an order of
magnitude increase in power-delay product. It is also important to consider
that much of the experimental data available in the literature quotes
power-delay products and maximum speeds based on ring oscillators, which are
usually single collector gates operating in an optimum situation. Normal
combinational logic with multiple collectors and interconnects would be
considerably slower.

The future improvements in I2L will be influenced mainly by advances
in process technologies. Ion implantation can be applied to design a bipolar
structure more adequate for I2L circuits, providing higher frequency
response and gain. Oxide isolation instead of junction isolation can be used
to reduce parasitic capacitances.

Electron beam technology will soon provide a factor of 5 or more
resolution improvement. It is reasonable to assume that I2L will reach
power-delay products of 0.001 - 0.01 pJ and speeds in the 1-2 ns range
within the next decade, as a result of E-Beam technology and advanced
processes.

I2L  is  at  present  the  highest  density,  lowest  power-delay  product 
bipolar  digital  logic  circuit  available  and  can  be  expected  to  dominate  the 
bipolar  LSI  field. 
2.6 CMOS TECHNOLOGY


As discussed in paragraph 2.4.2, CMOS combines both n- and p-channel MOS devices to form logic inverter stages. The n-channel device has a p-channel device as its load and vice versa. The p-channel MOST can actively pull up for the logic "1" output and becomes an almost infinite resistance for the logic "0" output. One transistor is off when the other is on, allowing for low power dissipation in either state. Since both devices operate as common-source amplifiers, the voltage gain in the active region (during transition) is high. Also, the time to switch from "1" to "0" is the same as the time to switch from "0" to "1", giving CMOS symmetry and generally higher speed than either PMOS or NMOS.

Since p-channel and n-channel devices require different substrates, a special process involving additional steps had to be developed to construct CMOS devices. The basic wafer or substrate is an n-type material. A p-type material is then deeply diffused into this substrate, forming a p-type well. Heavily doped n+ regions are diffused into the p-well, forming an n-channel device, and similarly doped p+ regions are diffused into the substrate, forming a p-channel device. This process is illustrated in Figure 2.6-1.

The  basic  CMOST  (CMOS  Transistor)  logic  circuit  is  the  inverter, 
which  uses  a complementary  set  of  devices  with  the  gates  tied  together.  Other 
logic  functions  built  on  this  basic  concept  are  shown  in  Figure  2.6-2. 

Figure 2.6-1. CMOS structure.

Figure 2.6-2. CMOS logic gates.

2.6.1 Primary CMOS Logic Considerations


In a CMOS gate during a logic transition, the devices that are on and those that are off can each be modeled as one equivalent device. Also, Swanson shows that series MOS transistors can be modeled as one equivalent device whose channel length is the sum of the series device lengths, and parallel transistors as one equivalent device whose channel width is the sum of the parallel device widths. Thus each CMOS logic gate can be replaced by an equivalent inverter. It is this equivalent device that will be the basis for analysis of CMOS logic. Mathematically, since the gain constant of an n-channel device is

\[ K_n = \frac{Z_n}{L_n}\,\mu_n C_{ox} \]

where

$Z_n$ = channel width

$L_n$ = channel length

$\mu_n$ = n-channel mobility

$C_{ox}$ = oxide capacitance

then two identical devices in series have a gain constant of

\[ K = \frac{1}{2}\,\frac{Z_n}{L_n}\,\mu_n C_{ox} \]

and for two in parallel

\[ K = 2\,\frac{Z_n}{L_n}\,\mu_n C_{ox} \]

The same analysis holds for p-channel MOS.
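
The series and parallel rules can be sketched in a few lines of Python. This is a minimal illustration, assuming equal-width devices in a series stack and equal-length devices in a parallel group; the function names and the numeric values are hypothetical and are not taken from Swanson's analysis.

```python
from typing import Sequence, Tuple

# A device is described by (channel width Z, channel length L) in the same units.
Device = Tuple[float, float]


def gain_constant(width: float, length: float, mobility: float, c_ox: float) -> float:
    """Gain constant K = (Z / L) * mu * C_ox of a single MOS device."""
    return (width / length) * mobility * c_ox


def series_equivalent(devices: Sequence[Device]) -> Device:
    """Equivalent device for a series stack: channel lengths add.

    Assumption of this sketch: all devices in the stack share one width.
    """
    widths = {z for z, _ in devices}
    if len(widths) != 1:
        raise ValueError("sketch assumes equal widths in a series stack")
    return widths.pop(), sum(length for _, length in devices)


def parallel_equivalent(devices: Sequence[Device]) -> Device:
    """Equivalent device for a parallel group: channel widths add.

    Assumption of this sketch: all devices in the group share one length.
    """
    lengths = {length for _, length in devices}
    if len(lengths) != 1:
        raise ValueError("sketch assumes equal lengths in a parallel group")
    return sum(z for z, _ in devices), lengths.pop()


# Hypothetical numbers: Z = 10, L = 5 (arbitrary length units),
# mu = 0.06 m^2/V.s, C_ox = 3.45e-4 F/m^2.
MU, C_OX = 0.06, 3.45e-4
k_single = gain_constant(10.0, 5.0, MU, C_OX)
k_series = gain_constant(*series_equivalent([(10.0, 5.0), (10.0, 5.0)]), MU, C_OX)
k_parallel = gain_constant(*parallel_equivalent([(10.0, 5.0), (10.0, 5.0)]), MU, C_OX)
```

With two identical devices, `k_series` comes out at half of `k_single` and `k_parallel` at twice `k_single`, which is the reduction used to collapse a gate into its equivalent inverter.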
A theoretical development of the speed-power product can be realized at the gate level using the equivalent inverter. Entire logic networks can be reduced and analyzed as gates in a similar manner. The equation defining the speed-power product for CMOS is developed by Swanson, and for two complete (positive and negative) transitions with rise time $t_{R1}$ and fall time $t_{F1}$ (at the output) is

\[ E_{PT} = E_{RT} + E_{FT} = C_L V_S^2\,[\,\ldots\,] \tag{2-1} \]

[bracketed factor illegible in source]

where $E_{RT}$ and $E_{FT}$ are the individual energies for the rise and fall times, which sum to the total energy (speed-power product) $E_{PT}$, and

$C_L$ = load capacitance on the equivalent inverter

$V_S$ = power supply voltage

$\tau_n$ and $\tau_p$ are the transit times of the n- and p-channel devices,

\[ \tau_n = \frac{2C_L}{K_n V_S}, \qquad \tau_p = \frac{2C_L}{K_p V_S} \]

\[ \beta = 1 - \alpha_p - \alpha_n \]

where $\alpha$ = relative threshold voltage.



Thus, the entire equation can be reduced (through numerous substitutions) to one in terms of $C_L$, $K_n$, $K_p$, $V_S$, and $V_T$, where $K_n$ and $K_p$ depend on the actual physical device parameters $\mu$, $C_{ox}$, channel length, and channel width. This type of equation can readily be written into a FORTRAN program, and a fairly straightforward power-speed design process realized.
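
As a rough illustration of how such a design program might begin, the sketch below codes only the two pieces that are legible alongside equation (2-1): the transit-time definition, 2 C_L / (K V_S), and the C_L V_S^2 energy scale. It is written in Python rather than FORTRAN, it does not implement the full equation (2-1), and all function names and numeric values are hypothetical.

```python
from typing import List, Sequence, Tuple


def transit_time(c_load: float, gain_constant: float, v_supply: float) -> float:
    """Transit time tau = 2 * C_L / (K * V_S), in seconds for SI inputs."""
    return 2.0 * c_load / (gain_constant * v_supply)


def switching_energy_scale(c_load: float, v_supply: float) -> float:
    """Energy scale C_L * V_S^2 in joules (not the full equation (2-1))."""
    return c_load * v_supply ** 2


def sweep_supply(
    c_load: float, k_n: float, k_p: float, supplies: Sequence[float]
) -> List[Tuple[float, float, float, float]]:
    """Tabulate (V_S, tau_n, tau_p, C_L*V_S^2) for each candidate supply voltage."""
    return [
        (
            v,
            transit_time(c_load, k_n, v),
            transit_time(c_load, k_p, v),
            switching_energy_scale(c_load, v),
        )
        for v in supplies
    ]


# Hypothetical equivalent inverter: C_L = 1 pF, K_n = 100 uA/V^2, K_p = 50 uA/V^2.
table = sweep_supply(1e-12, 1e-4, 5e-5, [1.0, 2.0, 5.0])
```

Lowering the supply reduces the energy scale quadratically while lengthening both transit times, which is the trade the design process has to balance.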

The calculation of the load capacitance term in this equation also relates directly to the fanout capability of the equivalent MOS inverter. This in itself is an important design consideration. As $C_L$ increases, the speed-power product increases. Figure 2.6-3 illustrates the capacitances that contribute to the load capacitance term. $C_{jp}$ and $C_{jn}$ are the p- and n-channel drain junction-to-substrate capacitances and $C_W$ is the wiring (metallization interconnect) capacitance.

Figure 2.6-3. Various capacitances connected to the output node of an equivalent CMOS inverter.

The development of the load capacitance equation is fairly straightforward except for the fact that during a negative output transition the p-channel devices connected to the output node are in their saturation region and the n-channel MOSTs are in their linear region, and vice versa for positive transitions. Since different capacitances appear in these two regions, the load capacitance must be separated into two distinct values for positive- and negative-going transitions. Referring to these as $C_{L+}$ and $C_{L-}$, the overall load capacitance equation is


\[ C_{L\pm} = C_{jp} + C_{jn} + C_W + (C_1 + C_2)_{eq}^{\pm} + \sum_i (C_{4i} + C_{5i})_{eq}^{\pm} \]

where the summation is over $i$, the number of gates to which the output is connected. Also, $(C_1 + C_2)_{eq}$ and $(C_{4i} + C_{5i})_{eq}$ are equivalent capacitances dependent on the delay, rise and fall times associated with the respective capacitances ($C_1$, $C_2$, $C_{4i}$, $C_{5i}$) during positive- and negative-going transitions. This is because capacitances in these terms are connected to time-varying voltage nodes and represent the Miller feedback effect.
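
The bookkeeping in the load capacitance equation can be sketched as a simple sum. This is only an illustration: it assumes the Miller-equivalent terms have already been evaluated for the transition direction of interest and are passed in as plain numbers, and the function name and the values used are hypothetical.

```python
from typing import Sequence


def load_capacitance(
    c_jp: float,
    c_jn: float,
    c_wire: float,
    c_feedback_eq: float,
    fanout_eq: Sequence[float],
) -> float:
    """Total load on the equivalent inverter output for one transition direction.

    c_jp, c_jn    : p- and n-channel drain junction-to-substrate capacitances
    c_wire        : interconnect (metallization) capacitance
    c_feedback_eq : equivalent feedback capacitance of the driving inverter
    fanout_eq     : one equivalent input capacitance per driven gate
    """
    return c_jp + c_jn + c_wire + c_feedback_eq + sum(fanout_eq)


# Hypothetical values in pF, evaluated separately for each transition direction.
c_load_fanout_1 = load_capacitance(0.02, 0.02, 0.05, 0.01, [0.03])
c_load_fanout_3 = load_capacitance(0.02, 0.02, 0.05, 0.01, [0.03, 0.03, 0.03])
```

Each added gate on the output raises the load by its equivalent input capacitance, which is how the load term ties directly to fanout.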
From the previous discussion it can now be seen that equation (2-1) is extremely useful as a design or analysis tool for CMOS logic circuits. The equation basically finds the speed-power product of an equivalent CMOS logic gate, which in itself is extremely important. But on the way to this result other considerations are found. The load capacitance indicates fanout capabilities and limitations through the analysis discussed above. Circuit packing density is also taken into account, as the transit times $\tau_n$ and $\tau_p$ are inversely proportional to the n- and p-channel gain constants $K_n$ and $K_p$. These in turn are directly related to device channel width and length, the limiting factors in MOS chip real estate. Some design considerations regarding circuit power supply levels are related through the $V_S$ term in equation (2-1) as well.

Equation (2-1) would indicate that with sufficiently small supply voltages and geometries, the fundamental logic limitations for power-delay can be overcome. Such devices require extremely small geometries. Swanson derives further restrictions on small MOS logic through a two-dimensional analysis, the results of which are shown in Figure 2.6-6.

Noise considerations in CMOS circuits, as in other logic families, are fairly complex. 1/f considerations are developed for MOS transistors by Backensto and can be extended to CMOS devices. Of prime importance at higher device operating frequencies above the 1/f corner is thermal noise, defined as

\[ V_{th} = \left( \frac{8}{3}\,\frac{kT}{g_m}\,\Delta f \right)^{1/2} \]

where

\[ g_m = \left( 2\,\frac{W}{L}\,\mu_p C_{ox} I_D \right)^{1/2} \tag{2-2} \]
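
The dependence of thermal noise on channel length can be checked numerically. The sketch below assumes the thermal noise expression reads V_th = sqrt((8/3) k T Δf / g_m) with g_m from equation (2-2); the device dimensions, mobility, oxide capacitance, drain current, and bandwidth are hypothetical values chosen only for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def transconductance(
    width: float, length: float, mobility: float, c_ox: float, drain_current: float
) -> float:
    """g_m = sqrt(2 * (W/L) * mu * C_ox * I_D), following equation (2-2)."""
    return math.sqrt(2.0 * (width / length) * mobility * c_ox * drain_current)


def thermal_noise_voltage(
    g_m: float, bandwidth_hz: float, temperature_k: float = 300.0
) -> float:
    """RMS thermal noise voltage sqrt((8/3) * k * T * bandwidth / g_m)."""
    return math.sqrt(
        (8.0 / 3.0) * BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz / g_m
    )


# Hypothetical device: W = 50 um, mu = 0.02 m^2/V.s, C_ox = 3.45e-4 F/m^2, I_D = 10 uA.
gm_long = transconductance(50e-6, 10e-6, 0.02, 3.45e-4, 10e-6)   # L = 10 um
gm_short = transconductance(50e-6, 5e-6, 0.02, 3.45e-4, 10e-6)   # L = 5 um
noise_long = thermal_noise_voltage(gm_long, 1e6)
noise_short = thermal_noise_voltage(gm_short, 1e6)
```

Under these assumptions, halving the channel length raises g_m by a factor of sqrt(2) and lowers the thermal noise voltage by a factor of 2^(-1/4), roughly 16 percent.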

From equation (2-1) it is noted that shorter channel lengths are needed for better power-speed products. This will also keep device thermal noise low. However, of prime importance in digital logic circuits are external noise sources and how much of this noise can appear on the input of an inverter without flipping the output from one state to another. The noise immunity of a logic circuit family is strongly dependent on its speed. Noise is capacitively coupled onto signal lines, thus slower logic response means better noise immunity. CMOS typically rejects voltage noise pulses up to 45 percent of the power supply voltage, and 30 percent is guaranteed in commercial devices throughout the industry.

Since signal processing in proposed infrared sensor systems involves MOS-CCD multiplexing and A/D schemes, CMOS logic gives an added option of having the logic compatible with the signal processing that comes before it.

Numerous design and analysis techniques enable CMOS to be incorporated well into LSI circuits. Swanson's equivalent inverter and the scaling techniques discussed by Keyes are two methods that support intelligent computer-aided techniques. Integrated circuit analysis programs such as SPICE are well developed and directly applicable to CMOS equivalent circuits. A Hughes-developed program called ANYMOS specifically accommodates design parameters encountered in CMOS logic.

2.6.2  CMOS  Status  and  Projections 

Present day CMOS logic devices are well established in the commercial market. Virtually all of the functions obtainable by standard TTL logic are now available in CMOS. However, custom designs must be implemented to obtain the highest levels of performance.

Most commercial CMOS devices are constructed on a bulk silicon substrate using the standard metal-oxide-semiconductor processing techniques. The various commercial processes are optimized with regard to yield, speed, power, and density. The limits are not necessarily optimum due to the heavy influence of cost. For a typical CMOS inverter operating at 5 volts, ambient temperature and a load capacitance of 25 pF, the power-speed product is about 800 pJ. This number is not a good figure of merit for LSI circuits, since interconnect capacitances there are very much lower. However, it does give some feel as to where discrete commercial CMOS is and the significance of large scale integration.
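
As a rough cross-check on the quoted figure, the load term alone can be evaluated with the $C_L V_S^2$ energy scale used in this section. This is a sketch under the assumption that only the 25 pF external load is counted; internal device capacitances and any current drawn during the transition are ignored.

```latex
\[
C_L V_S^2 = (25\ \mathrm{pF})\,(5\ \mathrm{V})^2 = 625\ \mathrm{pJ}
\]
```

The result is of the same order as the quoted 800 pJ; the remainder is not accounted for by this simple estimate.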
Of particular importance towards the implementation of high-speed, low-power circuits is the advent of an insulated-substrate CMOS process using sapphire. CMOS/SOS (SOS for silicon on sapphire) greatly reduces the junction-to-substrate capacitances involved in the input gate capacitance. This allows for an approximate 3:1 reduction in power-speed product over bulk silicon substrates since

\[ E_{PT} = \text{power} \times \text{speed} \propto C_L V_S^2 \]

This reduction refers to LSI circuits with minimum-geometry devices. In addition, polysilicon (self-aligned) gates may be used to reduce gate overlap capacitance and lower threshold voltage. This allows higher fanout, lower power dissipation, and higher speed. The SOS process is discussed further in paragraph 4.3.

Future trends in CMOS seem to point in certain definite directions. Certainly CMOS/SOS and polysilicon gate techniques will be pursued further for speed-power product reduction. Because the sapphire process is somewhat expensive, some performance tradeoffs may have to be made, though, and bulk silicon CMOS will continue in the commercial market.

Shorter channel lengths and reduction of capacity in the channel are certain to come as experience in dealing with small geometry devices grows. Hughes Newport Beach has constructed CMOS devices with 0.2 mil p-channel and 0.3 mil n-channel lengths as well as experimental devices down to 0.1 mil in channel length. Capacitance values are projected to fall from 0.2 to 0.1 pF/mil². These two factors in themselves combine for good future speed-power reductions.

An interesting topic that could well prove advantageous is the concept of ion-implanted, buried channel CMOS. Work has been done to characterize buried channel MOSFETs (ref 2.6-9) and to some extent CMOS (ref 2.6-10), and further investigation seems worthwhile.

A brief explanation of the argument for buried channel CMOS is appropriate. Today, CMOS logic is operated in or near weak inversion (see Figure 2.6-4) in order to maintain low power. However, operation in weak inversion causes a small effective mobility in the channel as $I_D$ is decreased, thus limiting the speed-power product. For a buried channel device, weak inversion is essentially a meaningless term, as the inversion is already present (see energy band diagram, Figure 2.6-5). Thus, attempting to better the speed-power product by lowering $I_D$ does not decrease the effective mobility as much as in a surface channel device. The higher mobility in a buried channel device decreases the speed-power product. In addition, ion implantation techniques shift the threshold voltage such that the power-speed product is reduced further.

Improved photolithographic techniques are necessary to achieve the high density, lower power-speed devices expected in the 1980s. The increased accuracy obtained by improving this technique would allow CMOS to approach the ultimate inverter limitations derived by Swanson. These are plotted in Figure 2.6-6. The following conditions are necessary to obtain the best possible performance:

L = 500 Å

V_S = 0.1 V

T_PD (pair delay) = 7.5 psec

Power = 20 nW

t_ox = 100 Å

With these conditions $E_{PT}$ = 1.5 x 10^-19 J. A low-voltage ring oscillator was fabricated by Swanson using ion implantation techniques and achieved a speed-power product of 0.08 pJ with a supply voltage of 0.4 V. These numbers approximately reflect future trends of CMOS.
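
The limiting figure follows directly from the listed power and pair delay, as the short check below shows. It simply multiplies the two quoted values and compares the result with the 0.08 pJ ring oscillator figure; the variable names are illustrative.

```python
def power_delay_product(power_w: float, delay_s: float) -> float:
    """Speed-power product in joules: power (W) times delay (s)."""
    return power_w * delay_s


# Limiting conditions quoted in the text: 20 nW at a 7.5 psec pair delay.
limit_j = power_delay_product(20e-9, 7.5e-12)

# Ring oscillator result quoted in the text: 0.08 pJ.
ring_oscillator_j = 0.08e-12

# How far the demonstrated device sits above the theoretical limit.
margin = ring_oscillator_j / limit_j
```

The product reproduces the 1.5 x 10^-19 J limit, and the demonstrated ring oscillator sits roughly five to six orders of magnitude above it.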

One last point with regard to the use of CMOS logic in space sensor systems is radiation hardening. The Hughes Newport Beach facility is extensively involved with radiation hardening and testing of CMOS. By the APSP time frame considerable valuable experience and process knowledge in this area will have been obtained and could be used to the benefit of APSP. Since APSP operates in a benign environment (typically, at 10^4 rads, only a few tenths of a volt threshold shift occurs), this knowledge is not critical, but would be useful in reducing supply voltages below approximately 1-1/2 to 2 volts. MOS devices are majority carrier technologies and do not (as bipolar I2L) exhibit the loss in gain due to increased recombination rates caused by total γ radiation dose. This effect in bipolar base regions degrades β.

Figure 2.6-4. Inversion regions in a MOSFET.

Figure 2.6-5. Band diagram of MOST structure with implanted layer N. beneath the substrate (N.>N_).

Figure 2.6-6. Maximum possible performance of CMOS inverters.

2.7 DMOS

The DMOS technology not only multiplies MOS speed but opens up the entire realm of microwave applications to MOS technology. It involves a two-stage diffusion through a single mask opening, permitting channels of 1-micron length to be formed simply and inexpensively.

The result: discrete microwave transistors that exhibit a 10-gigahertz maximum frequency of oscillation, a 7-decibel gain at 2 GHz, and a noise figure of 0.5 dB at 1 GHz, performance usually associated only with bipolar devices. DMOS devices can be switched with subnanosecond speeds, and have the added advantage of high breakdown voltage.

Microwave FETs could only be obtained with a channel length of about 1 micron, which requires 1-micron metal gate widths. But to manufacture devices with such narrow gates it has been necessary to use sophisticated, highly accurate photomasking techniques. Moreover, for digital ICs, high speed is also dependent on highly controlled doping and diffusion steps. The commercial DMOS process, on the other hand, achieves its micron-long channels with metal widths and oxide openings no smaller than 8 microns, demanding photomasking and diffusion tolerances no stricter than those presently used in conventional bipolar IC technology (ref 2.7-1).

The high performance of DMOS devices results directly from the method of forming the channel, as discussed in 2.4.2. In general, the frequency response, or speed, of any MOS transistor is determined primarily by channel length and parasitic capacitance, and improves as they become smaller. Reducing length cuts the transit time for carriers traveling between source and drain, while reducing capacitance decreases the charging time. (In an MOS device, parasitic feedback capacitance exists between gate and drain, $C_{gd}$, as well as between gate and source, $C_{gs}$.)

In the usual MOS devices, unfortunately, a short channel length usually entails large parasitic capacitance, because the separation between source and drain determines the amount of lateral diffusion under the gate for a given L, and the lateral diffused region, which is also highly doped, represents large $C_{gs}$ or $C_{gd}$. These parasitics can be minimized by ion implantation and polysilicon gate processes, which are self-aligning and reduce the overlap between the gate region and the source and drain regions.

The DMOS process eliminates these problems, and results in a device which has a precisely controlled channel length of less than 1 micron, minimal $C_{gs}$, very small feedback capacitance $C_{gd}$, and no restriction on maximum drain breakdown voltage. As Figure 2.7-1 illustrates, the DMOS channel is a narrow region, sandwiched between two opposite-type regions and created by the sequential diffusion of two opposite-type dopant impurities under a single mask edge in the source region. Once this edge is formed, the critical distance L is a function only of the diffusion schedule (masking, exposure and etching errors are eliminated), and since one diffusion edge follows the other, L can be controlled in much the same way as is the base width of a bipolar transistor.

Figure 2.7-1. N channel DMOS with N channel depletion load forming LSI inverter stage.

Lateral  diffusion  also  controls  channel  length  in  the  conventional 
process.  But  there  it  leads  to  large  overlap  parasitic  capacitance  when 
narrow  channels  are  required.  For  the  DMOS  process,  the  diffusion  lengths 
are  much  smaller,  even  for  very  narrow  channels,  and  because  overlap 
capacitance  only  exists  on  the  source  side  for  DMOS,  it  can  be  kept  to  a 
minimum. 

Channel lengths in the range of 0.4 to 2 microns are easily achieved in DMOS processing, even when the relatively noncritical bipolar diffusion schedules are used. In contrast, because the variations in mask quality, exposure and etching can affect channel length by 1 micron or more, typical MOS transistors in production today are fabricated with 5-micron lengths, and for this reason are unable to compete in speed with bipolar devices.

Integrating a DMOST driver with a depletion MOS load transistor provides an exceptionally high speed and low power (by virtue of the low supply voltage) LSI gate structure (see Figure 2.7-1) (ref 2.7-2). DMOS has a high gain factor due to the short effective channel length. The threshold voltage of the DMOS can be precisely controlled using self-aligned diffusions. Drain capacitance can presently be made about 1/4 that of conventional NMOS. The depletion load has efficient driving capability due to its constant current characteristics.

A 0.65 nsec propagation delay time for an integrated ring oscillator circuit has recently been demonstrated (ref 2.7-2), with a power-delay product of 0.10 pJ using a 2 volt supply voltage. A density of 510 gates/mm² was obtained.

In the same work a 4-bit arithmetic logic unit (ALU) with 48 operations was developed and obtained 2.9 nsec/gate and 2.0 pJ power-delay with 141 gates/mm² operating with a 5 volt power supply.

Table 2.7-1 is taken from reference 2.7-2. It shows a direct comparison between DMOS, ECL and T2L for similar ALU LSI functions. DMOS is superior in speed-power product and packing density and comparable to ECL in speed.


TABLE 2.7-1. COMPARATIVE ALU CHARACTERISTICS

Characteristic                        Unit   DMOS             ECL              TTL
tpd per gate                          ns     2.9              1.5              6.0
Power dissipation per gate            mW     0.71             15               8
p x tpd product per gate              pJ     2.1              22.5             48
Transition time (including buffer)    ns     32 (11 stages)   6.5 (4 stages)   24 (4 stages)
Number of logic stages                       9 ~ 12           4 ~ 6            4 ~ 6
Number of logic gates                        115              86               64 ~ 87
Chip area                             mm²    0.8              3                7.3
Supply voltage                        V      5                -5.2             5
Logic swing                           V      4                0.8              3.3
Device count                                 381              -                632  143
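
The product row of Table 2.7-1 can be reproduced from the delay and power rows, since nanoseconds times milliwatts gives picojoules. The short sketch below uses only the values transcribed from the table; the dictionary and function names are illustrative.

```python
# (tpd per gate in ns, power dissipation per gate in mW), from Table 2.7-1.
ALU_GATES = {
    "DMOS": (2.9, 0.71),
    "ECL": (1.5, 15.0),
    "TTL": (6.0, 8.0),
}


def power_delay_pj(tpd_ns: float, power_mw: float) -> float:
    """Power-delay product per gate in pJ (ns * mW = pJ)."""
    return tpd_ns * power_mw


products = {name: power_delay_pj(*values) for name, values in ALU_GATES.items()}

# Technology with the lowest power-delay product per gate.
best = min(products, key=products.get)

# Gate density implied by the DMOS gate count and chip area in the table.
dmos_gates_per_mm2 = 115 / 0.8
```

The computed products round to the tabulated 2.1, 22.5 and 48 pJ, with DMOS lowest by about an order of magnitude. The implied DMOS density of about 144 gates/mm² is close to the 141 gates/mm² quoted in the text.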

It appears that with future refinements in photolithographic techniques and DMOS on sapphire technology, or even Complementary DMOS on sapphire with low threshold voltages (power supplies in the 1 to 2 volt area), another order of magnitude may be obtained in power-delay product with subnanosecond gate delays for full LSI circuits. DMOS and related technologies have a promising future.

2.7 REFERENCES

2.7-1  Cauge, T. P., et al., "Double-diffused MOS transistor achieves microwave gain," Electronics, February 15, 1971.

2.7-2  Ohta, K., et al., "A High-Speed Logic LSI Using Diffusion Self-Aligned Enhancement Depletion MOS IC," IEEE Journal of Solid-State Circuits, Vol. SC-10, No. 5, Oct. 1975.

2.7-3  Hughes Aircraft Co., "Low Cost Real Time Processor for SAR Systems," Phase 1 Report, Contract No. F33615-74-C-1167, May 1975.


3.0 ADAPTIVE VIDEO ENCODER TECHNOLOGY


The Adaptive Video Encoder (AVE) of the APSP accomplishes the interface between the digital Layered Array Processor (LAP) and the analog detector signals from the Monolithic Focal Plane Array (MFPA). Included are the Moving Target Indication Estimator and video temporal prefilters. In addition, the AVE controls detector bias and clock frequencies on the MFPA, based upon peak signal strength, to adaptively optimize dynamic range. Thus the APSP, in addition to needing high speed logic in the LAP, requires A/D converters (8 bit), D/A converters (14 bit, in some feedback systems under consideration), counters, differential amplifiers, memory, and logic elements (gates, adders, multipliers) for the AVE. The basic word rate requirements vary from 16.4K words/sec to 164K words/sec. This section investigates the present state of the art of the most critical of these devices: the A/D and D/A converters. Detailed descriptions of the AVE and other new devices required for APSP implementation are the subject of a forthcoming report: Critical Device Design.

A company-funded development currently under way at Hughes is the CRC-100 signal processing CCD chip. The devices on this chip are specifically tailored to A/D, D/A and digital logic requirements similar to those of the AVE. Table 3.0-1 lists the specific devices and systems on the test chip. In general, a variety of functional components make up each integrated CCD A/D system. Connection pads are provided to allow component testing, characterization and optimization in addition to system operation. With its four A/Ds and other devices, the chip provides:

• A/D Division Elements - 2 types

• Comparators - 4 types

• Logic Elements - 3 types

• D/A Converters - 3 types

• Sample and Hold, Peak Detector, etc. - 4 types

• CMOS - 2 types

The Critical Device Design document will provide a closer examination of the individual circuits.

TABLE 3.0-1. HUGHES CRC-100 CCD TEST CHIP (N TYPE SURFACE CHANNEL)

Circuit/Function                   Description
Forward A/D System                 Serial output successive approximation A/D system
                                   with charge comparator
Feedback A/D System                Serial output successive approximation A/D system
                                   using digital feedback to D/A
CCD Half Adder                     Half adder building block cell which uses a "carry"
                                   sense diffusion to set a potential barrier for
                                   input 11 logic correction
CCD Full Adder                     Small geometry device which uses a charge trap and
                                   regeneration of the "carry" output to correct the
                                   input 011 logic state
CCD D/A Converter                  An 8 bit charge splitting CCD with parallel digital
                                   inputs to select the weighted charge packets
Forward Differential A/D System    Serial output successive approximation A/D system
                                   with MOS comparator
Multiplying Feedback A/D System    Serial output successive approximation A/D system
                                   using analog feedback
Test Devices                       CMOS/CCD process compatibility devices

In general, for each A/D system, the expected resolution is 8 bits with an operating speed of about 1 Mbit/sec while dissipating 10 mW or less (including clock power). The design goal is 10 Mbit/sec, which is the anticipated performance capability for the buried channel version to be finalized in early 1976.
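
To relate these bit rates to the AVE word rate requirement of 16.4K to 164K words/sec, the sketch below converts a serial bit rate into a word rate and an energy per converted word. It assumes one output bit per clock with no overhead cycles between words, which the text does not state; the function names are illustrative.

```python
def serial_word_rate(bit_rate_hz: float, bits_per_word: int) -> float:
    """Words per second for a serial-output converter (assumes no overhead bits)."""
    return bit_rate_hz / bits_per_word


def energy_per_word(power_w: float, word_rate_hz: float) -> float:
    """Energy in joules dissipated per converted word."""
    return power_w / word_rate_hz


expected_rate = serial_word_rate(1e6, 8)     # expected performance: 1 Mbit/sec
goal_rate = serial_word_rate(10e6, 8)        # design goal: 10 Mbit/sec
energy_j = energy_per_word(10e-3, expected_rate)  # at the 10 mW upper bound
```

Under this assumption the expected 1 Mbit/sec corresponds to 125K words/sec, somewhat below the 164K words/sec upper requirement, while the 10 Mbit/sec goal corresponds to 1.25M words/sec. At 10 mW the expected case is about 80 nJ per word.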

The  following  sections  examine  the  status  of  converter  technology, 
followed  by  a brief  discussion  of  current  analog  transform  technology. 

3.1 CONVERTERS

The purpose of this section is to summarize the current state of the art in analog-to-digital and digital-to-analog converters. Units with special features are highlighted and then put into perspective by the use of a comparison table. Major emphasis is on total systems; therefore only those subsystem building blocks which offer valuable and unique features are presented.

Linear converters are presented in Tables 3.1-1 and 3.1-2. Although other types have significant advantages in some applications, they are not appropriate for the AVE task. An example is Precision Monolithics' companding D/A that follows standard nonlinear speech compression laws.

One outstanding new device is the Hughes 4 bit monolithic A/D encoder which has been operated with a 2.5 nsec conversion time; by combining four of these devices, a 6 bit word can be generated. This device dissipates 1.4 watts. A second device is a 6 bit monolithic D/A converter, which converts in 6 nsec and dissipates 0.7 watts. Another Hughes development is the 6 bit, 200 Mword/sec converter which has been demonstrated (it utilizes 205 watts of input power).

A 5 bit MOS monolithic clockless A/D converter has been described. It utilized portions of a continuously variable threshold device and achieved conversion times of 2 µsec; power dissipation was not reported.

Another unique device described in reference 3.2-2 is an all-MOS successive approximation weighted capacitor A/D conversion technique. It performs a 10 bit conversion in 20 µsec. The acquisition time is 25 µsec, thus the conversion rate is 22 kHz.
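
The quoted conversion rate follows from adding the acquisition and conversion times, as the short check below shows. It assumes the two intervals run back to back with no overlap; the function name is illustrative.

```python
def conversion_rate_hz(conversion_time_s: float, acquisition_time_s: float) -> float:
    """Sustained conversion rate when acquisition and conversion do not overlap."""
    return 1.0 / (conversion_time_s + acquisition_time_s)


# 10 bit weighted capacitor converter: 20 usec conversion, 25 usec acquisition.
rate_hz = conversion_rate_hz(20e-6, 25e-6)
```

The 45 µsec total cycle gives about 22.2 kHz, consistent with the 22 kHz stated.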

TABLE 3.1-1. COMMERCIAL A/D CONVERTERS

Co.                     Model        Bits   Time, µsec   Power               $*       Notes
Computer Labs           9000         13     0.1                              13,980
Datel                   ADC HY12BC   12     .8           2 w                 79
                        EH12B3       12     2            2.325 w             299
Analog Devices          1103         12     3.5          5.1 w               495
Burr-Brown              ADC 85       12     10           0.45 w              225
                        ADC 60       12     3.5          2.85 w              395
Computer Labs           9000         12     0.1                              13,980
Analogic                MP 2712      12     <4           3.3 w               229
ILC Data Device Corp.   ADH-10/1     12     0.8                              990
Teledyne                4129QZ       12     24
                        4132         12     3.5
                        4133         12     2.5
Computer Labs           9000         11     0.1                              8,200
Burr-Brown              ADC 85       10     6                                185
                        ADC 60       10     1.88         2.85 w              395
Aydin Vector            ADH-10       10     25           1.025 w
Analog Devices          1103         10     1.2          5.1 w               484
                        1123         10     65           75 µJ/conversion    299
ILC Data Device Corp.   ADH-10/1     10     0.8
Teledyne                4131         10     1
Datel                   ADC CM10B    10     310          90                  159
Datel                   M10B         10     1            3.3 w               895      ±20 ppm/°C
                        G10B         10     1            .1 w                349      ±50 ppm/°C
Datel                   VH8B         8      0.2          8.3 w               895
                        UH8B         8      0.1          8.3 w               995
Datel                   ADC CM8B     8      250          90                  149
Analog Devices          1103         8      1            5.1 w               473
Micro Networks          5060         8      100          53 mw
                        5065         8      100          53 mw
                                     8      1            1.125 w             295
Intech                  A-857-8      8      0.8                              199
Analog Devices          AD75705      8      35
Hughes                               6      0.005        205 w
Hughes                               4      0.0025       1.4 w

*Included as an indicator of relative complexity.


At the same conference (ref 3.2-2), R. B. Craven presented a bipolar LSI 12 bit D/A converter consisting of a 97 x 180 mil Si-Cr resistor network and a 79 x 179 mil chip of active circuitry; power dissipation was not reported.

3.  2 ANALOG  TRANSFORM  TECHNOLOGY 

The  advent  and  development  of  CCD  recursive  and  non-recursive 
(transversal)  filter  technology  opens  the  door  to  a wide  variety  of  matched 
filtering  and  analog  correlation  signal  processing  previously  not  available 
for  analog  design.  This  section  briefly  reviews  the  current  status  of  CCD 
transversal  filters,  followed  by  an  example  involving  Walsh -Hadamard 
Transforms  using  transversal  filters. 

3.  2.  1 CCD  Transversal  Filter  Status 

CCD cross-correlators (Ref. 3.2-4) provide a convolution between input
and reference analog signals. A special case of such a circuit is the CCD
Analog Transversal Filter (TVF), which has a fixed set of appropriately
weighted reference coefficients that multiply incrementally delayed signal
samples. The sum of the weighted time samples provides the convolution
of the reference function and the signal.
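
The weighted-sum operation described above can be illustrated with a short
numerical sketch. It is only an idealized model under stated assumptions: the
delay line is lossless, and the tap weights and input samples are hypothetical
values chosen for illustration, not data from the 2091 chip.

```python
def transversal_filter(samples, weights):
    """Return the TVF output sequence for a list of input samples."""
    delay_line = [0.0] * len(weights)   # models the CCD delay stages
    output = []
    for x in samples:
        # Shift the new sample in; older samples move one stage down.
        delay_line = [x] + delay_line[:-1]
        # Sum of (tap weight) x (delayed sample) over all taps.
        output.append(sum(w * d for w, d in zip(weights, delay_line)))
    return output


def direct_convolution(samples, weights):
    """Reference: discrete convolution truncated to len(samples)."""
    return [
        sum(weights[k] * samples[n - k]
            for k in range(len(weights)) if n - k >= 0)
        for n in range(len(samples))
    ]


weights = [1.0, -2.0, 0.5]              # hypothetical tap weights
samples = [1.0, 0.0, 3.0, -1.0, 2.0]    # hypothetical input samples
tvf_out = transversal_filter(samples, weights)
```

The tapped delay line and the direct convolution sum give the same output
sequence, which is the sense in which the TVF "provides the convolution".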


TABLE 3.1-2. COMMERCIAL D/A CONVERTERS

Co.                    Model                    Bits             Time, µsec      Power     $*
Analogic               1916                     16               3               1.05 w    485
Burr-Brown             DAC 70                   16               50              -         149
Intech                 A856                     16               8               -         1,300
Analogic               1915                     15               2               1.05 w    440
Analogic               1914                     14               1.5             1.05 w    395
Dynamic Measurements   -                        13 (12, 10, 8)   -               -         350
Datel                  DAC-HY12BC               12               I 0.3, V 3.0    1.05 w    29
FMI                    175-12                   12               3.5             -         395
Analog Devices         DACH08                   12               0.150           780 mw    122
                       AD 563                   12               1.2             525 mw    42
Micro Networks         MN371                    12               35              90 mw     -
Burr-Brown             DAC 80                   12               I 0.3, V 3.0    800 mw    25.50
Micro Networks         MN310-1                  10               3               500 mw    79
Burr-Brown             AD7522                   10               0.5             -         -
Computer Labs          -                        10               0.066           -         1,010
Micro Networks         MN316-1                  8                1.0             0.4       59
Analog Devices         AD7522 (µP compatible)   8                0.15            -         -
Hughes                 -                        6                0.006           700       -

*Included as an indicator of relative complexity


Test chip CCD 2091, shown in Figure 3.2-1, consists of four such matched
TVFs with several variations of input gain and sample and hold circuits,
differential amplifiers, a charge comparator, and a charge subtractor. The
chip measures 0.195 x 0.195 inch and is fabricated using a p channel
overlapping aluminum/polysilicon electrode structure and 2:1 projection
alignment technology. These devices replace conventional frequency domain
analog filters and give significantly better performance.


Figure 3.2-1. Hughes 2091 CCD matched filter test chip. (Labeled areas:
comparator circuit, filter 1, filter 2, subtractor circuit.)
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The  matched  filter  concept  can  be  summarized  by  the  functional  form 
of a matched filter transfer function, i.e.,

$$G(j2\pi f) = \frac{k\,S^{*}(j2\pi f)\,\exp(-j2\pi f\Delta)}{|N(j2\pi f)|^{2}}$$

where

k = a constant,

S* = complex conjugate of the signal frequency spectrum, which is the
frequency domain equivalent of the time inverse of the signal,

Δ = phase factor, which corresponds to an appropriate time shift,

|N(j2πf)|² = noise power density spectrum to which the filter is to be
mismatched.
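
For the special case of white noise, |N|² is constant and the matched filter
impulse response reduces to the time-reversed signal. The sketch below
illustrates that case only; the pulse shape and its position are hypothetical,
and real-valued samples are assumed so that the complex conjugate has no
effect.

```python
def matched_filter_output(received, template):
    """Convolve the received samples with the time-reversed template.

    With white noise the matched filter impulse response is simply the
    time inverse of the known signal (real samples assumed).
    """
    impulse_response = template[::-1]
    n_out = len(received) + len(template) - 1
    out = [0.0] * n_out
    for n in range(n_out):
        for k, h in enumerate(impulse_response):
            if 0 <= n - k < len(received):
                out[n] += h * received[n - k]
    return out


template = [1.0, 1.0, -1.0, 1.0]               # hypothetical known pulse
received = [0.0] * 5 + template + [0.0] * 3    # pulse starts at sample 5
response = matched_filter_output(received, template)
peak_index = max(range(len(response)), key=lambda i: response[i])
```

The output peaks where the pulse is fully aligned with the filter, with a
peak value equal to the pulse energy; the fixed delay to the peak plays the
role of the time shift Δ.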

Filter 1 has 19 delay bits followed by 40 weighted bits. This filter
has an output differential amplifier that dissipates about 100 µW and has a
bandwidth of 30 kHz and a gain of 3. An on-chip sample and hold circuit
eliminates clock feedthrough in the filter output waveforms (see Figure 3.2-2).

Figure 3.2-3 compares the theoretical and experimental frequency
responses from a filter (#3) at a clock frequency of 31.2 kHz. The slight
discrepancies observed at low signal frequencies are due to capacitive
imbalance between the positive and negative sides of the filter, and
systematic tap weight errors, which can occur during chip fabrication.
Higher frequency differences are caused by a combination of transfer
inefficiency and the bandwidth of the filter as measured by the delay from
the filter input to the last tap. For a 31.2-kHz clock, the corresponding
bandwidth is 637 Hz; therefore, as signal components increase above this
frequency, the effects of transfer inefficiency become important. These
filters have also operated with as low as a 300-Hz clock rate at room
temperature with insignificant deterioration in performance, i.e., a slight
shift in fat zero which reduced dynamic range by about 1 dB. This suggests
acceptable dark current levels of approximately 10 nA/cm². (Filter #2 was
not tested.)

The excellent correlation between theoretical and measured frequency
response indicates the feasibility of reducing the size of the CCD register
while maintaining satisfactory accuracy of the tap weights for greater density
and higher frequency. Techniques have been developed recently that eliminate
entirely the requirement for output differential amplifiers in transversal
filters. This is important for high frequency TVF implementation due to
the unacceptable common mode rejection and power dissipation of high
bandwidth differential amplifiers.

3.2.2 Walsh-Hadamard Transform Domain Signal Processing Devices

Two types of signal processors are considered for the APSP
application: 1) real time pixel space processing, and 2) real time transform
domain processing. In order to implement the second type of signal
processing (Reference 3.2-1), a CCD Walsh-Hadamard Transform (WHT) filter is
required as shown in Figure 3.2-4. A set of 32 Hadamard sequencies
(normalized frequencies) is generated with a set of 32 WHT filters. Two of
these filters are shown in Figure 3.2-5. The filters are finite impulse
response (FIR) transversal filters with binary (±1) tap weights.

The output amplitude of the WHT filter with sequency k (k = 0, 1, 2,
..., 31) is the projection of the signal vector onto the corresponding one
of the 32 Walsh basis vectors. This amplitude is encoded using an A/D
converter with a number of bits consistent with the dynamic range at the
output of each filter. The amplitudes decrease monotonically for increasing
sequency if the input signal is an optical photo-generated signal (Lukosz
bound, Reference 2).
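
The projection onto Walsh basis vectors can be sketched numerically as
follows. This is an illustrative model only: it builds the ±1 basis with the
Sylvester construction and reorders the rows by sequency, which is one common
way to obtain Walsh functions and is not taken from the 2088 chip design. The
ramp input is a hypothetical test signal.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def sequency(row):
    """Number of sign changes (zero crossings) along a +/-1 row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def walsh_basis(n):
    """Hadamard rows reordered so that row k has sequency k."""
    return sorted(hadamard(n), key=sequency)


def wht(signal):
    """Project the signal onto each Walsh basis vector (one per filter)."""
    basis = walsh_basis(len(signal))
    return [sum(w * s for w, s in zip(row, signal)) for row in basis]


N = 32
basis = walsh_basis(N)
ramp = [float(i) for i in range(N)]     # hypothetical smooth input
coefficients = wht(ramp)
```

Each row of `basis` corresponds to the ±1 tap weights of one filter, and each
entry of `coefficients` to the output amplitude that would be passed to the
A/D converter for that sequency.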


Figure 3.2-4. Adaptive Hadamard Transform processor.


The amplitude statistics in the sequency domain are generally well
balanced Gaussian statistics with zero mean value, except for sequency 0.
Conventional coding rules can be applied to the encoding of each WHT
filter output with a resultant overall reduction in the number of bits,
compared to that required for the original signal amplitude. A 3:1
reduction has been obtained for standard TV picture coding, without
noticeable degradation. The concept of transform domain processing can
be extended to other orthogonal functions, such as the cosine functions.
The advantages of the Walsh-Hadamard Transform are its binary
characteristics and ease of implementation. The suitability of the WHT for
APSP is discussed in the Processor Architecture report, CDRL A006.


Figure 3.2-5. Dual 16 Element Hadamard Filter Chip No. 2088.
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4.  0 DEVICE  TESTING 


As  part  of  this  survey,  several  devices  in  various  technologies 
designed  and  fabricated  by  Hughes  were  tested  in  conjunction  with  the 
evaluation  programs  associated  with  each  device.  In  general  these  tests 
involved  one  or  a few  of  the  devices  on  the  chip  concerned,  and  typically 
only  those  parameters  of  interest  to  the  APSP  application  were  evaluated. 

The data in many cases does not represent optimized process iterations. It
does, however, provide an insight into potential performance capabilities
and demonstrates the involvement of Hughes in virtually all of the state of
the art technologies, generally in considerable depth. The data is presented
in a variety of formats, including tabulations, computer printouts, graphs,
and oscilloscope trace photographs.

4.1 I²L DEVICES

This section provides test data collected on a first iteration I²L
process under evaluation at the Hughes Newport Beach facility.

The chip is identified as the Hughes Model 2100, and contains 32
separate devices. Figure 4.1-1 is a photograph of the 2100 chip with a
brief description of the devices. Those evaluated in this report are (a) the
Ring Oscillators (R.O.) No. 6 and No. 9, (b) the eight stage shift register
No. 17, (c) the ten stage frequency divider No. 15, and (d) the 4 bit adder
No. 18. The Ring Oscillator No. 9 is actually two 15 stage oscillators
called No. 9 and No. 9B. No. 9B is a smaller geometry version of No. 9
and No. 6.

Two different chips were tested as ring oscillators; one is designated
A-7 and the other A-3. A-7 is a standard I²L process called down diffused
(or implanted) to distinguish it from the improved process called up diffused,
which is used in the A-3 chip. Details of this process are described in the
Hughes Patent Disclosure No. 75207. A set of six graphs is shown in
Figures 4.1-2 to 4.1-7. The first three compare the processes at room
temperature and the second three show the effect of temperature on the A-3
chip.

densities. Of the two most attractive candidates, I²L and CCD, projected
for the early 1980's, CCD technology offers a number of advantages over
I²L, the most important being small element size and low power
dissipation. CCD memories are dynamic devices and consume little power in
the standby mode of operation (even with refresh), while I²L memories
consume considerably more power. CCD memory organized in a
serial-parallel-serial (SPS) arrangement also offers the greatest bits per
chip density and concurrently minimizes peripheral circuitry and access
times. An important attribute of the SPS organization is that most of the
charge transfers are done at a low frequency which vastly reduces the
effective power delay product.

4.1 CCD MEMORIES

Present state-of-the-art photolithographic techniques, as used on
the Hughes 2069 chip of Figure 4.1-1, have the resolution capability of
producing 0.7 mil/bit (18 µm) shift registers. Advanced photolithographic
techniques utilizing 4:1 reductions can produce optimal shift registers of
0.4 mil/bit (10 µm). Future processing, however, will utilize high
resolution electron beam (E-beam) technology. An internally funded program has

Figure 4.1-1. Hughes 32K bit CCD memory (chip 2069)

Figure 4.1-2 shows time delay per stage as a function of supply
current per stage. Oscillators Number 6 and Number 9 have the same
geometry, but Number 9 has improved isolation with less capacitance,
resulting in lower delay than Ring Oscillator Number 6. The lowest delay is
that of Ring Oscillator Number 9B, as a result of its smaller geometry.

Figure 4.1-3 shows power-delay product versus current, and again
the same trend is evident, except that now oscillator 9 of chip A-3 shows a
substantial improvement over the smaller geometry 9B of chip A-7, depicting
the process improvement. Figure 4.1-4 shows the output signal level for the
same ring oscillators. A 1K load resistor was used to obtain accurate
waveforms at the higher speeds. An increase in output level can be obtained
with larger load resistors. In chip A-7 the value of the output supply voltage
was somewhat critical, for reasons not completely understood at this time,
with 1.75 volts giving the largest output. A-3 was not as sensitive to
variations in this voltage (3 V was used).

Figure 4.1-2. Ring oscillator time delay per stage versus stage current.

Figure 4.1-3. Power delay product versus stage current
for various ring oscillators.

The characteristics of A-3 with temperature, shown in Figures 4.1-5,
4.1-6, and 4.1-7, establish clearly that the performance improves with
temperature for the parameters checked. This is due largely to the
improvement in beta and the lowered emitter-base voltage drop.

Figure 4.1-4. Output voltage as a function of stage current for
various ring oscillators.

All of the ring oscillators used had fifteen stages connected as shown
in Figure 4.1-8. The lateral PNP's have a β of about 10 and must obviously
be well matched so that the supply current will be evenly distributed
throughout the stages. This does not pose a problem with normal processing.
A more practical problem is that of furnishing an efficient power supply. As
shown by Figure 4.1-5 the supply voltage is strongly dependent upon
temperature. This comes as no surprise since it is simply the emitter to base
drop of the lateral PNP. For the purposes of this test, as in practically all
other published I²L data, a large resistor (10K to 100K) was used in series
with the device to control the input current, and the power dissipation for
the power delay products of the device was calculated by using the device
voltage drop rather than the much larger supply voltage. Clearly, when I²L is
compared with other forms of logic, the unique loading of the power supply
and its losses must be included. This could well make a significant
difference in the results of the comparison.
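
The size of this effect can be illustrated with a short calculation. All
numerical values in the sketch are hypothetical and chosen only to show the
trend; they are not measurements from the A-3 or A-7 chips.

```python
def power_delay_product(voltage_v, current_a, delay_s):
    """Power-delay product in joules for one gate."""
    return voltage_v * current_a * delay_s


# All values below are hypothetical, chosen only to show the effect.
stage_current = 10e-6      # A per stage
stage_delay = 50e-9        # s per stage
device_drop = 0.7          # V, emitter-base drop of the lateral PNP
supply_voltage = 5.0       # V, ahead of the series current-setting resistor

pdp_device = power_delay_product(device_drop, stage_current, stage_delay)
pdp_supply = power_delay_product(supply_voltage, stage_current, stage_delay)
penalty = pdp_supply / pdp_device
```

With these assumed values the power-delay product referred to the supply is
about seven times the figure computed from the device voltage drop alone,
since the ratio is simply the supply voltage divided by the device drop.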

Another factor of concern is cross talk or feedback, since there is
no decoupling in the basic I²L circuit. This is another factor which leads to
quoting optimistic power levels for I²L, since there is usually a substantial
power loss associated with decoupling in discrete circuits. However, favoring
I²L technology in this respect is the fact that it is limited to digital
circuits, and therefore should not be compared to the decoupling
requirements of analog devices. Coupling problems will become more
critical in large arrays where the ohmic resistance of the interconnects
becomes an important factor.

Device A-10 included a shift register which was operated as shown
in Figure 4.1-9.

Since the shift register triggers on a high-to-low transition, the
circuit shown in Figure 4.1-9 was used to ensure that no false triggers
affected testing. At the maximum clock frequency of 100 kHz, the output was
lagging by almost 1/2 clock cycle (5 µsec), which corresponds to a delay of
0.625 µsec/stage. The output supply was adjusted to +3.6 V for a maximum
output of 0.3 V. The power input was set to +11 V through a 10K resistor.
Eleven volts was selected as being in the center of the 9-12.5 V operating
range of the device. The actual chip supply voltage was higher than expected
(1.8 volts). Because of the limited operating voltage range, it was not
possible to get data on speed versus supply current. The reason for this
voltage limitation is not fully understood at this time. In addition, the
maximum frequency of operation should have been on the order of 5 MHz, rather
than the observed 100 kHz. The causes of all of these limiting conditions are
presently under investigation. In all cases, the above is preliminary data,
realized from 1 each of 5 devices on the chip.
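
The per-stage delay quoted above follows from simple arithmetic, sketched
here. The stage count of eight is an assumption: it takes the register under
test to be the eight stage shift register No. 17 listed earlier in this
section.

```python
clock_hz = 100e3                 # maximum clock frequency observed
stages = 8                       # assumed: eight stage shift register No. 17
lag_fraction = 0.5               # output lags by almost 1/2 clock cycle

clock_period_s = 1.0 / clock_hz
total_lag_s = lag_fraction * clock_period_s
delay_per_stage_s = total_lag_s / stages
```

Under that assumption the half-cycle lag of 5 µsec divides into 0.625 µsec per
stage, matching the figure in the text.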


Figure 4.1-9. Test circuit for shift register.

Tests on the 4 bit look ahead adder resulted in a maximum of 1 MHz
operation with a supply current of 2 mA at 2.2 volts. This device has
63 transistors. Tests on properly operating frequency dividers have reached
5 MHz.

4.  2 PERISTALTIC  CCD 

The Hughes 2096 test chip includes a peristaltic 64 bit shift register.
This section provides test data for an N channel device operating at a
103 MHz clock rate. Figure 4.2-1 is a schematic diagram of the device.
Those tested were fabricated on a 3.5 µm thick N epitaxial layer. Gate
length is 1.2 mils/bit and gate width is 4.0 mils. The register is clocked at
103 MHz with 2φ, 10 V P-P sinewave clocks illustrated in Figure 4.2-2.
The CCD input consists of a modulated 3rd gate with the first two gate
electrodes biased to form a current source. The output sense diffusion was
connected directly to a 250 Ω load resistor. The reset gates were not used.

The device was connected as in Figure 4.2-1 with bias conditions
listed in Table 4.2-1. The input gate (φin3) was modulated with a pulse
having 4 ns rise and fall times. The input and delayed output (10 ns x 64
bits = 640 ns) are shown in Figure 4.2-4.

Frequency response and transfer efficiency can be estimated from the
rise time of the CCD output waveform. (The source of the output ripple in
the early part of the trace has not yet been identified.) The observed output
rise time of 13.6 nsec is determined by a) input pulse, b) register, and c)
output circuits. The input rise time (4 ns) and output rise time (8.5 ns) are
known, therefore the register bandwidth can be estimated from a magnitude
response function of three cascaded single pole functions. This calculation
leads to a 45 MHz bandwidth, which is consistent with the theoretical
sin x/x response at a 103 MHz clock rate.

The maximum bucket capacity can be estimated from the peak-to-peak
output voltage swing across the load resistor. The maximum voltage swing
is 37.5 mV and the test circuit output capacitance was about 11 pF; from
ΔQ = CΔV, ΔQ = 0.41 pC.

Thus

$$\text{Bucket capacity} = \frac{0.41 \times 10^{-12}}{1.6 \times 10^{-19}} \approx 2.6 \times 10^{6}\ \text{electrons}$$
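
The same estimate can be reproduced in a few lines, using only the values
stated in the text (37.5 mV swing, about 11 pF, and 1.6 x 10^-19 C per
electron); the variable names are illustrative.

```python
ELECTRON_CHARGE_C = 1.6e-19      # value used in the text

swing_v = 37.5e-3                # peak-to-peak output swing
output_capacitance_f = 11e-12    # approximate test circuit capacitance

delta_q_c = output_capacitance_f * swing_v          # from dQ = C dV
bucket_electrons = delta_q_c / ELECTRON_CHARGE_C    # charge packet size
```

This gives a charge packet of roughly 0.41 pC, or about 2.6 million electrons.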


These results show device operability and represent an approximation
of the register's ultimate performance capability, since the test equipment
available prevented more accurate determination of performance. Test
results indicate transfer efficiency of >0.999 and bandwidth of 45 MHz at a
clock frequency of 103 MHz. Further evaluation at higher clock frequencies
using appropriate test equipment will more accurately determine the
limitations of the device. These results compare favorably with data presented
by Rockwell International at the 1975 CCD Applications Conference at San Diego
(>0.999 at 105 MHz clock). Philips Research Labs achieved >0.9999 at
frequencies of 100 MHz in 1973. The 2096 has comparable performance when
the I/O circuit effects are normalized out.

4.3  CMOS/SOS 

Many companies in the industry are investigating the CMOS/SOS
process. The Newport Beach facility of Hughes is well along in the
development of advanced techniques for implementing CMOS/SOS (Figure 4.2-4).
In particular, a considerable amount of effort has been invested in the
implementation of minimum geometry CMOS/SOS devices with polysilicon gates.
When standard metal gates are used, the overlap of the source and drain
diffusions underneath the metal gate area adds an extra capacitance factor.
In addition, the difference in the Fermi levels of the metal and the
semiconductor causes a work function term (φms) in the threshold voltage
equation. If a polysilicon gate is used (as illustrated in Figure 4.3-1 for an
n-channel device) the gate material is the same as the semiconductor, causing
the work function term to drop to zero and decreasing the threshold voltage.
Decreased threshold voltage in turn decreases both the supply voltage and
the switching voltage required, improving the speed-power product. In
addition, there is no overlap capacitance involved in a polysilicon self
aligned gate. Polysilicon gates are thus created through a self-aligned
process and the devices have better fanout capabilities due to the reduced
capacitance. Gate capacitance is reduced about 40% compared to that with bulk
silicon. The Hughes Newport Beach facility utilizes an electron beam (EBIM)
process for accurately aligning the structure and providing the ion
implantation used in the process.

The following data summarizes the present status of CMOS on sapphire
technology (data compiled from Hughes, Newport Beach).


For a minimum size inverter:

channel width       = 0.6 mil
channel length      = 0.3 mil
capacitance/mil²    = 0.2 pf/mil²
supply voltage      = 10 V
oxide thickness     = 1000 Å
P/f = speed x power = C V² = 3.6 pJ (advanced circuit)

Figure 4.3-1. N channel MOST self aligned gate structure.

Figure 4.3-2. CMOS/SOS 256 bit static shift register. Chip size: 130 x
138 mils.

Hughes recently constructed a 256 bit CMOS shift register on a 130 by 138 mil
chip containing 4200 transistors (Figure 4.3-2). The basic unit cell building
block contained 16 transistors on a 25.6 mil² area, to give some idea of the
packing density involved. The speed power product of 3.6 pJ is a good figure
for contemporary LSI functions. Channel lengths of 0.1 mil were obtained.
With some devices, power supply voltages as low as 1.5 volts were
demonstrated.
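
The 3.6 pJ figure can be cross-checked from the inverter data above. The
sketch assumes that the speed-power product is C V², with C taken as the gate
area (channel width times channel length) multiplied by the stated capacitance
per square mil; that reading of the tabulated data is an interpretation, not
something stated explicitly in the text.

```python
channel_width_mil = 0.6
channel_length_mil = 0.3
cap_per_area_pf = 0.2            # pF per square mil
supply_v = 10.0

gate_area_mil2 = channel_width_mil * channel_length_mil
gate_cap_pf = gate_area_mil2 * cap_per_area_pf
speed_power_pj = gate_cap_pf * supply_v ** 2    # pF x V^2 = pJ
```

Under that assumption the gate capacitance is 0.036 pF and the product is
3.6 pJ, in agreement with the quoted value.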


(Figure 4.3-2 labels: complete chip, 4200 transistors; unit cell,
16 transistors.)
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4.4  WALSH- HADAMARD  FILTER 


The WHT transversal filters incorporated in the 2088 chip (described
in Section 3.2.2) were operated at a 10 MHz clock rate. The filters have a
1.2 mil bit length and are implemented with buried P channel technology.
Figure 4.4-1 illustrates the impulse response for a sequency 8 device. The
impulse response represents a Walsh function (±1 tap weights) for which the
sequency is the number of zero crossings.


Figure 4.4-1. Hadamard Filter Impulse Response,
Hughes Chip No. 2088, Sequency 8; 10 MHz Clock.
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4.  5 CCD  COMPATIBLE  BIPOLAR  DEVICE 


The Hughes 2096 chip contains several interface circuits of the
Bipolar-MOS (BIPMOS) configuration to provide information on the
compatibility of bipolar devices with MOS devices and buried channel CCDs. In
addition, the BIPMOS circuit can be utilized as an output buffer-amplifier.
This section provides detailed characterization of two of the bipolar
transistors on the chip. Section 4.6 provides the results of SPICE computer
modeling and testing of the bipolar device in conjunction with a MOS common
source amplifier.

Devices from lot 9, wafer 8 and lot 14, wafer 10, fabricated using
the mask set #2096, were tested in order to characterize the vertical NPN
transistor proposed for use in the BIPMOS driver circuit. The #2096-9
version is a non-isolated, N substrate device and the #2096-14 version is an
isolated, P substrate device. The emitter and collector saturation currents
and emission coefficients were calculated from a least squares curve fit of
the DC base to emitter voltage versus the logarithm of the emitter current
in both the forward and reverse modes of operation. The low frequency
emitter and collector resistances were measured by saturating the transistor
with a current source base drive and then calculating the resistances. The
results of these tests are summarized in Table 4.5-1.

The procedures used in making the small signal and junction
capacitance measurements, summarized in Table 4.5-2, were an automated version
of essentially those described in Reference 4.5-1, except as follows:

1. Cje and Cjc curves were obtained by using a new measurement
   technique (relative to the one discussed in Ref. 4.5-1). The
   new technique, suggested by Hewlett Packard personnel,
   requires the use of an HP 4271 LCR meter. As shown in
   Figure 4.5-1(a), due to the manner in which operational amplifiers
   are incorporated in the instrument sensing circuitry, the
   HP 4271 behaves very nearly as an ideal voltage source driving
   the test capacitance, with the current through the test capacitance
   sensed at virtually zero impedance. This is suggested in the
   equivalent circuit shown in Figure 4.5-1(b). These
   characteristics of the HP 4271 nullify the effects of shunt parasitic
   capacitances CS1 and CS2, also shown in Figure 4.5-1(b). (The
   current through CS1 is not measured and no current flows
   through CS2.)

   Consequently, Cje and Cjc capacitance versus junction voltage
   curves may be made without corresponding "Pads Only"
   capacitance measurements. Instead, both measurements are made by
   connecting the HP 4271 directly across the junction of interest.
   In either case, all other junction and fixed parasitic capacitances
   will be found to act as shunt capacitances (CS1 and CS2) to an
   "exterior" ground. Unfortunately, this technique can NOT be
   used in measuring Ccs versus Vcs. This is the case because
   (at least) a constant pad plus header parasitic capacitance appears
   in parallel with the collector-substrate junction even if the
   substrate is insulated from the header (e.g., by an insulating epoxy
   bond). Consequently, a pads only device was used in
   collector-substrate capacitance measurements.

Figure 4.5-1. Test configuration for Cje and Cjc measurements:
(a) HP 4271 in capacitance measurement; (b) equivalent circuit.
Note: circuitry internal to the HP 4271 for applying DC bias to the test
(e.g., junction) capacitance, Cx, is not shown.

2. fT measurements were made using a pulse response test. fT
   measurements are usually made by determining the small signal
   short circuit current gain at a fixed frequency, f0. This
   frequency should be at least one octave above fβ and at least two
   octaves below fT. In this case, the CT approximation to the
   hybrid-pi model is valid, assuming small (nearly negligible)
   external and extrinsic collector resistance in the measurement
   setup. Second order effects, neglected in the hybrid-pi model,
   make results invalid for f0 near fT. For most ECL transistors
   tested, fT ≥ 1 GHz and fβ ≤ 10 MHz. Therefore, our
   measurements are made at f0 = 100 MHz. Special instrumentation for
   these measurements incorporates tuned circuitry (described in
   Ref. 1) which will only operate at 100 MHz. Consequently, the
   instrumentation can not be used for accurate fT measurements
   on devices with fT < 400 MHz. This is also true for RB base
   resistance tests since they require similar instrumentation.

From preliminary measurements made on devices from wafer 10, it
was determined that fT was below 400 MHz. It was therefore necessary to
use an alternate technique which involves applying a step voltage signal to
the transistor in the common emitter configuration. Then, with RL + RC
<< Rπ, the collector waveform may be analyzed to determine fT. The
measurement setup used is shown schematically in Figure 4.5-2.

A small (e.g., 200 mV) voltage step is applied with VBB initially
biased sufficiently to just make the device active (e.g., 0.7 V). The output
voltage, shown in Figure 4.5-3, is then given by:

$$V_o = \frac{\beta V_{IN} R_L}{R_S + R_\pi}\left(1 - e^{-t/(R_P C_T)}\right)$$

The final output current can be calculated from Figure 4.5-3 to be
60 mV/25 ohms = 2.4 mA. Given beta = 34 at this current (see Figure 4.5-11),
Rπ can be calculated at the final output current:

$$R_\pi = \beta V_T / I_E = 368\ \text{ohms}$$

The total base resistance, and therefore RB, can be calculated from the final
voltage gain (e.g., 60 mV/200 mV):

$$A_v = \frac{\beta R_L}{R_S + R_\pi}$$

$$R_S = 2\text{K} + R_B = \frac{\beta R_L}{A_v} - R_\pi = 2465\ \text{ohms}$$

$$R_B = 465\ \text{ohms}$$
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Figure  4.5-3.  Output  Waveform 
for  step  input. 

Note that variations of RB with output current will be effectively swamped
out by the 2K series resistor.

From Figure 4.5-3, Vo = 25 mV at t = 22 nsec, and therefore
IE = 1 mA. Then using the previous equations:

$$R_\pi = 936\ \text{ohms}$$

$$R_S = 2.47\ \text{K}$$

$$R_P = 679\ \text{ohms}$$

$$e^{-t/(R_P C_T)} = 0.527$$

$$R_P C_T = 34.3\ \text{nsec}$$

$$R_P C_T = \frac{R_S R_\pi C_T}{R_S + R_\pi} = \frac{\beta R_S}{(R_S + R_\pi)(2\pi f_T)}$$

$$f_T = 121\ \text{MHz at } I_E = 1\ \text{mA}$$
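
The chain of calculations above can be reproduced numerically as a check. The
sketch uses the values read from Figure 4.5-3 as quoted in the text; the
thermal voltage of 26 mV and the value of beta at 1 mA (inferred here from
Rπ = 936 ohms) are assumptions, since the text does not state them.

```python
import math

V_T = 0.026          # thermal voltage in volts (assumed, about 26 mV)
R_L = 25.0           # load resistance, ohms
V_IN = 0.200         # input step, volts
R_S = 2470.0         # 2K series resistor plus base resistance, ohms
R_PI = 936.0         # hybrid-pi input resistance at 1 mA, ohms
I_E = 1e-3           # emitter current at the sample point, amperes

v_sample = 0.025     # output voltage read from the waveform, volts
t_sample = 22e-9     # time of that reading, seconds

beta = R_PI * I_E / V_T                          # inferred from R_pi = beta*V_T/I_E
v_final = beta * V_IN * R_L / (R_S + R_PI)       # asymptote of the step response
decay = 1.0 - v_sample / v_final                 # equals exp(-t / (R_P * C_T))
tau = -t_sample / math.log(decay)                # R_P * C_T, seconds
f_t = beta * R_S / ((R_S + R_PI) * 2.0 * math.pi * tau)
```

With these inputs the exponential term comes out near 0.527, the time constant
near 34.3 nsec, and fT near 121 MHz, consistent with the tabulated results.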

Then 
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Figure  4.5-11.  1 MHz  pad  and  collector  to  substrate 

capacitance  vs  junction  voltage. 


4.6  BIPMOS 


The 2096 BIPMOS (bipolar-MOS) CCD output buffer demonstrates the
compatibility of MOS and bipolar processes and examines one buffer
configuration. The device was tested in the lab and the test results were
compared with the computer analysis.

The device structure is shown in Figure 4.6-1. It consists of a
vertical NPN bipolar output transistor with an N channel MOS transistor.
The NPN bipolar is operated as an emitter follower for current gain and the
MOS is operated common source for voltage gain. The circuit is shown in
Figure 4.6-2. The device achieves the voltage and current gain
necessary for CCD output buffering.


Figure  4.6-1.  BIPMOS  structure. 


Figure  4.6-2.  BIPMOS  circuit  diagram. 


The BIPMOS device was modeled using the SPICE computer aided
design program in the configuration shown in Figure 4.6-3. The circuit was
analyzed for several values of base resistor using device parameters
calculated from physical dimensions and parameters measured on the bipolar
transistor as described in Section 4.5.

Tables 4.6-1 and 4.6-2 show the model parameters used for the
bipolar and MOS portions of the device. Figure 4.6-4 shows the measured
and calculated curves of optimum gate bias and voltage gain versus base
resistance. Bandwidth versus base resistance is shown in Figure 4.6-5.
Attachment 4.6-1 is a typical computer run showing dc transfer and ac
analysis plots.

The  results  of  the  computer  analysis  show  reasonable  correlation 
with  measured  data,  indicating  a useful  model.  The  BIPMOS  device  has 
demonstrated  the  compatibility  of  processes  and  could  be  a useful  output 
buffer  using  a high  frequency  bipolar  transistor. 
TABLE 4.6-1. BIPMOS MODEL PARAMETERS - BIPOLAR

Parameter                              Symbol   Value

Forward beta                           BFM      210
Base resistance                        RB       1.6 K ohms
Emitter resistance                     RE       11 ohms
Collector resistance                   RC       12.5 ohms*
Collector-substrate capacitance        CCS      3.0 pf*
Forward transit time                   TF       83 ps
Base-emitter junction capacitance      CJE      3.06 pf*
Base-collector junction capacitance    CJC      2.23 pf*
Forward knee current                   IK       1.12 amps
Base-emitter junction potential        PE       0.7 V
Base-collector junction potential      PC       0.7 V

* Measured values.

TABLE 4.6-2. 2096 BIPMOS MODEL PARAMETERS - MOS

     Parameter                            Symbol   Value

 1)  Threshold voltage                    VTO      1 V
 2)  Surface potential                    Phi      0.7 V
 3)  Transconductance                     Beta     3.0 x 10^?
 4)  Bulk threshold                       Gamma    0.73 V
 5)  Gate-source capacitance              CGS      0.026 pf
 6)  Gate-drain capacitance               CGD      0.013 pf
 7)  Gate-bulk capacitance                CGB      0.46 pf
 8)  Base-drain junction capacitance      CBD      0.09 pf
 9)  Base-source junction capacitance     CBS      0.09 pf
10)  Bulk junction potential              PB       0.7 V
5.0 CONCLUSIONS


Existing LSI technologies have been reviewed and state-of-the-art
new technologies that have potential for LSI have been identified. These are
summarized in Figure 5.0-1, along with projections of the corresponding
speed and power-delay products for the early 1980's.

Improvements in fabrication resolution using electron beam
technology are expected to increase resolution in the next ten years by
approximately an order of magnitude. These improvements will apply to all
technologies. Insulating substrates and/or insulator logic cell isolation will
further decrease capacitance and increase speed.

Assuming that industry and government funded research and
development continue at their present levels, it is highly likely that at least one
technology will be available in the early 1980's with logic LSI implementation
capable of providing 0.4 nsec gate delays and a 0.08 picojoule power-delay
product. For APSP design purposes, these figures are considered adequate.
DMOS and CMOS/SOS will emerge as competitors and both may exceed that
performance.

Thus more than an order of magnitude improvement in power-delay
product and an order of magnitude reduction in gate delay beyond present
LSI DMOS should be a demonstrated capability by the early 1980's. This does
not even consider advanced technologies yet to be conceived or reported.
For example, if a combination of low voltage complementary DMOS on an
insulator were to be implemented within the decade, utilizing advanced
electron beam techniques, LSI gate delays of 200 picoseconds and
power-delay products of 0.01 picojoules might be expected, plus increased circuit
density.


Should the potential of CMOS/SOS or DMOS fail to be realized, a
backup alternative for APSP digital design using I²L provides an estimated
factor of eight reduction in power-delay product, with about a factor of twenty
increase in propagation delay. This is based upon the anticipated I²L
technology of the early 1980's, i.e., 8 nsec delays and 0.011 picojoule
power-delay products.

Not apparent on power-delay curves, however, is the advantage of
CMOS operating at lower than maximum speed relative to technologies that
draw large standby current. The static power-delay product of present
commercial CMOS is 0.002 picojoule at 5 volts due to leakage current. Unless
the logic usage (duty cycle) is very high, the effective power-delay product
for systems is greatly decreased for CMOS since the power used is
proportional to duty cycle. It becomes clear that very fast CMOS/SOS is preferred.
An overall logic duty cycle of 10 percent (not unrealistic) makes 1982
CMOS/SOS competitive with I²L in effective power-delay, with the additional benefit
of higher speed (20X) for arithmetic functions requiring it.
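
The duty-cycle argument above can be illustrated with a few lines of arithmetic. This is a minimal sketch under stated assumptions: effective power-delay is modeled as a static term plus the dynamic power-delay product scaled by duty cycle, the 0.08 picojoule projection is taken as the 1982 CMOS/SOS figure, and I²L is treated as drawing full power regardless of duty cycle. The function name and the simple linear model are illustrative, not taken from the report.

```python
import math


def effective_power_delay(dynamic_pj, duty_cycle, static_pj=0.0):
    """Effective power-delay (pJ) for logic that is active only part of the time.

    Assumed model: static contribution plus dynamic contribution scaled
    linearly by duty cycle (0.0 to 1.0).
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return static_pj + duty_cycle * dynamic_pj


# Projected early-1980's figures quoted in the text.
CMOS_SOS_DYNAMIC_PJ = 0.08   # assumed to apply to 1982 CMOS/SOS
CMOS_SOS_DELAY_NS = 0.4
I2L_POWER_DELAY_PJ = 0.011   # treated as duty-cycle independent (assumption)
I2L_DELAY_NS = 8.0

cmos_effective = effective_power_delay(CMOS_SOS_DYNAMIC_PJ, 0.10)
speed_ratio = I2L_DELAY_NS / CMOS_SOS_DELAY_NS

print(f"CMOS/SOS effective power-delay at 10% duty: {cmos_effective:.3f} pJ")
print(f"I2L power-delay: {I2L_POWER_DELAY_PJ:.3f} pJ")
print(f"CMOS/SOS speed advantage: {speed_ratio:.0f}X")
```

With these inputs the 10 percent duty cycle case comes out slightly below the I²L figure, which matches the "competitive" claim in the text. Adding a nonzero static term narrows the margin.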

Device LSI density will play a major role in the tradeoff. RAM
memory will probably follow the same technology. Larger serial memories
requiring high duty cycles may follow either CCD or I²L technologies
depending on the overall per chip power.

Table 5.0-1 depicts present and projected densities and projected
power-delay products. A figure of merit that combines gate area with
power-delay is shown in the last column of the table. Though the figure of
merit for CMOS/SOS appears lowest, duty cycle was not included in the
figure of merit. As discussed, the duty cycle dependent power saving for
complementary technologies (CMOS) is probably going to be the deciding
factor. Duty cycle is strongly dependent upon system architecture and
cannot be factored into a figure of merit at this time. An all-encompassing
figure of merit would also include, in some manner, the computing power
concept discussed in Section 2.4.5. Qualitatively, CMOS/SOS leads I²L
and DMOS in the above considerations.

The thermal limitations associated with dense high speed logic form
a critical design limitation. Figure 5.0-1 includes a scale which assumes
sufficient chip and header thermal conductivity to remove 100 mwatts/mm².
This corresponds to 625 mwatts for a 100 x 100 mil chip and is probably an
upper bound for LSI space applications without liquid coolant. It thus appears
that technologies exceeding 1000 gates/mm² with more than 1 mwatt/gate
dissipation may be thermally limited without associated advancement in LSI
thermal design.

Figure 5.0-1. Power delay products of various technologies showing 1975 LSI,
1975 ring oscillator, and projected 1982 capabilities.
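
The thermal bound discussed above is simple arithmetic. The sketch below is illustrative only: it takes the 100 mwatts/mm² removal capability from the text, converts mils to millimeters exactly (1 mil = 0.0254 mm), and derives a per-chip limit and a per-gate power budget. Function names are hypothetical.

```python
import math

MIL_TO_MM = 0.0254           # exact conversion
HEAT_FLUX_MW_PER_MM2 = 100.0  # removal capability assumed in the text


def chip_power_limit_mw(side_mil, flux=HEAT_FLUX_MW_PER_MM2):
    """Maximum dissipation (mW) for a square chip of the given side length."""
    side_mm = side_mil * MIL_TO_MM
    return flux * side_mm ** 2


def power_budget_per_gate_mw(gates_per_mm2, flux=HEAT_FLUX_MW_PER_MM2):
    """Largest per-gate dissipation (mW) that stays within the flux limit."""
    return flux / gates_per_mm2


limit_100_mil = chip_power_limit_mw(100.0)
budget_at_1000 = power_budget_per_gate_mw(1000.0)

print(f"100 x 100 mil chip limit: {limit_100_mil:.0f} mW")
print(f"Budget at 1000 gates/mm^2: {budget_at_1000:.2f} mW/gate")
```

The exact conversion gives roughly 645 mwatts; the 625 mwatt figure in the text corresponds to rounding the chip side to 2.5 mm. At 1000 gates/mm² the budget works out to 0.1 mwatt per gate, so a technology dissipating more than 1 mwatt per gate at that density would exceed the assumed limit by an order of magnitude.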

Analog CCD technology is expected to become increasingly popular in
signal processing and other applications. The Adaptive Video Encoder is
expected to exploit the unique characteristics of CCD's, both analog and
digital. CCD logic appears to have application where associated with CCD
memory, which has a strong future for serial memory systems.

Microprocessor evolution will continue at a tremendous rate of
growth, providing more complex functions and memory per chip, and
obtaining 25 to 60 nanosecond cycle times. CMOS technology will be widely used,
followed probably by DMOS. CMOS/SOS, though attractive in performance,
is not likely to be pursued by commercial interests due to the cost of sapphire
processing.
1.0 INTRODUCTION


This report presents the results of the Critical Device Design task
of the Adaptive Programmable Signal Processor (APSP) program, and is
identified as CDRL item A008 of Contract Number F04701-75-C-0241.

The report describes those circuit devices whose existence is
critical to the successful development of the APSP. These include digital
converters, both D/A and A/D, low-power mass memories, and a high
speed (8M instructions per second) microprocessor.

Also included are descriptions and discussions of risk associated with
the critical process technologies that need to be developed to permit
fabrication of the above devices. These include high resolution, i.e., sub-micron,
lithography using electron beams or x-rays in place of light waves, and
silicon-on-an-insulator technology.

Section seven of the report contains schedules for the design,
development and test of several of the critical devices or processes identified
earlier.

The report concludes with two appendices (separate cover) containing
Hughes-proprietary information on the design and performance analysis of a
company-funded CCD A/D converter test chip (CRC-100).


2.0 ADVANCED SEMICONDUCTOR PROCESSING


The phenomenal growth in the complexity of silicon integrated circuits
during the nearly two decades of their existence cannot proceed indefinitely.
We have seen an approximate doubling in the number of components per
chip each year to the present level of about 10^? components per chip. This
growth has resulted from a 64-fold increase due to circuit design
improvements, a 20-fold reduction in linewidth by higher resolution lithography, and
a 12-fold increase in chip area. With the advent of electron lithography
during the last few years, linewidths as narrow as 0.045 µm are possible
(which exceeds the requirements of the APSP), compared to the average
linewidth of about 5-7 µm for present-day integrated circuits. This would
suggest that advanced lithography alone could carry us to approximately 10^?
components per chip. However, Hoeneisen and Mead have predicted that
component density growth may flatten out at about 10^? components per chip
for silicon integrated circuits because of fundamental device physics
limits of the MOS field effect transistor. That is, for a MOSFET the
minimum separation between source and drain can be no less than about 0.2 µm,
a limitation imposed by source-drain punch through (doping 4 x 10^? cm^-3)
and gate oxide (50 Å) breakdown. A more recent analysis by Klassen
suggests slightly greater minimum dimensions because of avalanche
injection at the drain interface. Compared to present integrated circuits, even
this remaining 100-fold growth possibility is very attractive for advanced
satellite memory and microprocessor system applications and, moreover, is
attainable by means of electron beam lithography, advanced circuit design
and advanced silicon processing methods.
Lithographies

Advanced lithography and submicron device processing are the
pivotal technologies for APSP memory development. The high density, low
power memory systems required will depend heavily upon high resolution
lithography (electron beam and x-ray) and successful projection
photolithographic processing of state-of-the-art serial CCD memory chips and CMOS
logic.

Because high resolution lithography is so important to the critical
areas of the APSP, we next introduce several general methods for improved
lithography and discuss image area/resolution limitations of photolithography.
in advanced lithography systems, and there is even some limited
improvement still possible in projection lithography systems by going to shorter
wavelength light and smaller fields.

The diagram in Figure 2-1 depicts the variety of ways that electron
beam systems are used for microcircuit microfabrication and diagnostics.
Figure 2-1. Electron beam/microelectronic device technology.
Electron beam diagnostic techniques, shown in the far right-hand column
and only briefly mentioned here, are absolutely indispensable in electron
beam lithography. Scanning electron microscopy (including compositional
analysis) is the only way to examine critically devices and circuits with
passive and active regions that are of the order of one square micrometer
or less in surface area.

The utility of direct electron beam device exposure lies in
experimental device fabrication for design checkout and in the fabrication of the
highest performance (low volume) specialty devices and circuits. If device
yields prove to be high, direct electron beam fabrication may be cost
effective without replication for certain devices and circuits. Full-field
replication by transmission x-ray lithography will probably form a viable basis of
a batch production process.

Image Area/Resolution Requirements

Electron beam/x-ray lithography offer considerable growth potential
for higher resolution while conventional projection photolithography has
very little room for growth.

Because device and IC performance (and probably also fabrication
yield) can benefit considerably from the use of higher resolution
lithography, let us examine the requirements that such device fabrication places
on the lithographic technique. Figure 2-2 diagrams one aspect of these
requirements, viz., the pattern area coverage that is necessary in order to
fabricate various classes of devices, together with the limits of photo- and
electron-beam lithography. The right-hand curve, marked PH, indicates the
maximum area that can be covered at a given resolution (minimum linewidth)
by the best present-day optical and projection techniques. The region to
the right of this line is accessible by photolithography, except that the
upper limit shown has been approached only in careful R&D-type work.
Production design standards are typically 5 µm over 2 in. The left-hand curve,
marked EB, denotes the approximate limit of electron beam lithography,
drawn to represent a scan field of 2 mm x 2 mm with 0.1 µm resolution.
Above 2 mm the resolution limit is constant at 0.1 µm, due to the estimated
minimal errors in the step-and-repeat process that is required to obtain
areas larger than 2 x 2 mm.

Figure 2-2. Image area and resolution requirements for advanced high
resolution IC fabrication.

The approximate pattern areas required by discrete devices and
integrated circuits are shown shaded, with a number of specific devices that
have been built shown by the points. In production, these discrete devices
would be made in wafer-size lots and, therefore, require lithography
capability with correspondingly larger area, as shown by the upper shaded regions.
The dashed diagonal line represents about the largest single electron
beam scan field that appears practical in the near term, due to beam
deflection errors, electronic stability of the total system, and electron optics
aberrations. It is clear that the regions of the figure representing the most
interesting and highest performance devices and circuits of the future are
inaccessible by photolithography; this technology is being pushed to its
practical limit. However, these regions are well within the boundaries of
electron beam lithography. This is an important point, because it indicates
that the electron beam technology will not be used at the extremes of its
capability, and that has favorable implications for yield.

The triangles on the chart represent acoustic surface wave delay
lines made at IBM and Hughes. The diamonds represent MOS circuits
fabricated using electron beam lithography, and the circle represents Hughes'
very high resolution electron beam resist work.

The main point here is that electron beam lithography permits
patterns with linewidths as narrow as 0.05 µm to be exposed. In light optical
techniques, pattern linewidth is limited by diffraction, scattering and
interference effects. The narrowest line possible by photolithographic techniques
is about 0.4 µm, using a conformable mask to provide virtually zero spacing
between mask and resist. In practice, linewidth values less than about one
micron are very difficult to achieve by contact or projection photolithography.

Electron  Beam  Lithography 

Electron beam lithography is a maskless process utilizing both
positive and negative resists that, when combined with other beam processes,
offers submicron three-dimensional device "tailoring." The way in which
a single electron beam is used in microelectronic fabrication (refer to
Figure 2-3) is to create a surface mask in resist on the substrate. This
resist pattern is then used in any of the subsequent fabrication processes
that requires pattern definition.

The basis of this technology is a finely focused electron beam that is
deflected over a surface and blanked under digital computer control. The
electron beam exposes the resist where it strikes. Subsequent development
will either remove the exposed part (positive resist) or remove the unexposed
part (negative resist). If this resist pattern is on the device being
fabricated, it can then be used directly for any of the subsequent fabrication
processes, as diagrammed in Figure 2-3. This process is used for those
classes of devices requiring the highest resolution, or for low volume work
(e.g., R&D) where replication of production quantities of identical patterns
is not required. Alternatively, this pattern generation technique can be
used to create a noncontacting replication mask, which can be used either as
a shadow mask for x-ray or light exposure on the device substrate, or as
the image from which electrons are focused onto the final resist by a
suitable large-area electron optical system. Basically, the goal is to deflect,
as rapidly and accurately as possible, a well-focused high current electron
beam in a randomly addressable manner within as large an area (field) as
possible; then mechanically to step and repeat this exposure field with an
accuracy of 0.1 µm independently of or contiguously with similar fields over
the entire substrate field, which may be as large as a 3-inch diameter
silicon wafer.
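
To give a feel for the scale of the step-and-repeat task described above, the following sketch estimates how many exposure fields and addressable points are involved. It is a rough illustration under stated assumptions: the 2 mm x 2 mm scan field is the value quoted for the EB curve of Figure 2-2, the 0.1 µm figure is treated as the address grid (the text gives it as resolution and stepping accuracy), and the field count is an area-based lower bound that ignores partial fields at the wafer edge. Function names are hypothetical.

```python
import math


def fields_per_wafer(wafer_diameter_mm, field_side_mm):
    """Area-based lower bound on the number of square fields covering a wafer."""
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    return math.ceil(wafer_area / field_side_mm ** 2)


def points_per_field(field_side_nm, grid_nm):
    """Number of addressable points in one square field (integer arithmetic)."""
    per_side = field_side_nm // grid_nm
    return per_side * per_side


WAFER_DIAMETER_MM = 76.2   # 3-inch wafer
FIELD_SIDE_MM = 2.0        # scan field quoted for the EB limit curve

n_fields = fields_per_wafer(WAFER_DIAMETER_MM, FIELD_SIDE_MM)
n_points = points_per_field(2_000_000, 100)   # 2 mm field, 0.1 um grid

print(f"Fields per 3-inch wafer (lower bound): {n_fields}")
print(f"Addressable points per field: {n_points:.1e}")
```

Under these assumptions a full wafer needs on the order of a thousand stage moves, each followed by an exposure with several hundred million addressable points, which is why deflection speed and stage accuracy are emphasized.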

Hughes  Programs  in  High  Resolution  Lithography  (Brief  Summary) 

For the past seven years, Hughes Research Laboratories has carried
out an R&D program on electron beam lithography. We presently have
three Cambridge scanning electron microscopes capable of microfabrication
and diagnostics. Each of these machines is under dedicated closed-loop
minicomputer control for microfabrication; one of the instruments has been
modified extensively for higher speed lithography. This facility is housed
in a clean room in which most of the other key processes of silicon IC
processing can be accomplished without transfer of the devices through a
potentially contaminating environment.

Figure 2-3. Electron beam lithography and beam microfabrication technology.

We also have a program in high resolution replication, including both
the x-ray and conformable mask methods. These processes are also housed
in a clean environment.

The main elements of these programs and the evolutionary growth of
our effort are summarized in Table 2-1. Two examples of this high
resolution microfabrication which are applicable to submicrometer electrodes and
gaps for CCD arrays are shown in Figure 2-4. The aluminum metallization
pattern is for a 16-bit, 3-phase shift register, where the single level metal
electrode separation is only 0.3 µm. More recently 0.6 µm linewidths have
been obtained in polysilicon films for 2-level polysilicon gates. The resist
pattern with 460 Å lines on 0.58 µm centers speaks for itself. This work is
aimed toward advancing the state of the art in device performance to provide
means for improving our advanced electronic systems. The submicrometer
lithographic systems presently in use and still under development represent
three overlapping generations of high resolution lithography.

The  earliest  system  is  a scanning  electron  microscope  (SEM)  which 
is  used  primarily  for  diagnostics  and  for  prototype  device  exploration  and 
system  development.  The  second  electron  beam  system  under  development 
is  considerably  more  advanced  and  is  based  on  our  experience  with  the 
SEM  system.  The  present  x-ray  system  is  serving  the  purpose  of  system 
development  and  prototype  device  exploration  via  x-ray  replication.  We 
describe  in  the  following  pages  the  principal  features  of  these  experimental 
lithographic  systems  and  the  results  obtained  to  date  on  these  programs. 
TABLE 2-1. SUBMICRON DEVICE LITHOGRAPHY AND PROCESSING AT HUGHES

R&D Tasks / Specific Topics: Results

Direct Electron Beam Fabrication

  Device Processing
    EB Resists (PMM, PGM, PVP): 0.1 µm exposed and developed lines (Ref. 1)
    Pattern Registration: Demonstrated ±0.1 µm over 1 mm x 1 mm (Ref. 4)
    Combined Beam Processes: JFET fabricated by using implantation, EB
      lithography, and sputtering (Ref. 5)
    Plasma Etch/Strip: Used to fabricate submicrometer structures in polysilicon

  Device Fabrication
    Surface Acoustic Wave Filters: BW 560 MHz; f_c = 1.3 GHz (Ref. 6, 7)
    Integrated Optics (Guides/Couplers): 1 µm guides; 3600 Å gratings (Ref. 8, 9)
    SBFETs (GaAs, Si): Under development for X-band (Ref. 10)
    MOSFETs and CCDs: 0.5 to 1.0 µm gate lengths under development

  New EB System
    Deflection Coils, Amplifier: Fabricated, under test
    Electrostatic Beam Blanking: Fabricated, under test
    Laser/Computer-Controlled Stage: Fabricated, under test
    Software/Firmware (Pattern Generation, Registration, Real-Time Process
      Control, System Diagnostics): For devices and processes listed above (Ref. 6)

Replication

  Conformable Glass
    Mask Fabrication by EB Lithography: Chrome on thin glass with 0.6 µm
      line widths
    Pattern Replication (Contact Fixtures, Resist Processing): Submicron lines
      made in positive photoresist
    Device Fabrication (Substrate Cleaning, Al Liftoff): SAW pulse compression
      filter fabricated with 635 fingers, 0.6 µm wide, in each of the transducers

  X-Ray
    Mask Fabrication
      Silicon (by Contact Photolithography and EB Lithography): Gold on 2 µm
        silicon, 2.5 µm linewidths accomplished (Ref. 11). 0.6 µm linewidths
        fabricated (Ref. 12)
      Mylar: In process
    Pattern Replication: Good results in PMM and metal acrylate. Fair results
      in KMNR 747 (Ref. 13)
    Registration (Piezoelectric Stage, Servo Electronics, Alignment Mark
      Evaluation): 130 Å/V sensitivity, 10 µm range, 100 Hz bandwidth built
      and tested
    Device Fabrication
      SAW Device: In process
      Microwave FET: Under development

Examples  of  Devices 

A number of devices have been fabricated by electron beam techniques.
Ion implanted junction FET switches were fabricated first at Hughes by
electron beam lithography and ion beam sputtering. Patterns in positive
electron resist (PMMA) were used to define areas in the underlying metal
layer that was subsequently ion beam sputtered and removed. Then the
metal with the sputter etched pattern was used as an ion implantation mask.
Figure 2-5 shows the device configuration. The width of the extended source
drain region which was implanted is 1 µm. In subsequent fabrication this
width was reduced to 0.4 µm. This device was developed initially as a high
conductance, low switching power microwave switch.
Both MOSFETs and CCD shift registers are under study. The
Hughes electron beam microfabrication work on acoustic transducers has
largely involved work in the 1.0 GHz region on lithium niobate, and as such,
has not demanded ultimate resolution (typical metal lines were only 0.6 µm
and larger in width). Some recent work has been in the area of
pulse-compression filters, an example of which is shown in Figures 2-6 and 2-7.
One of the transducers making up the filter is shown in Figure 2-6. The
transducer has 634 electrodes covering a span of 0.892 mm. Electrode
widths are 5000 Å. Width is held constant to within 8% over the entire array
by employing specially developed process control software which determines
optimum exposure conditions as a function of adjacent electrode spacing.
Figure 2-7 shows the insertion loss for the filter as a function of frequency.
The center frequency is 1.3 GHz and the bandwidth is 500 MHz.


X-Ray  Lithography 

The x-ray replication technique is shown in Figure 2-8 and is similar
to contact microradiography. A mask consists of a semitransparent substrate
on which the desired pattern exists in a thin film highly absorbing to x-rays.
Electron beam lithography is used to generate these high resolution mask
patterns. The mask is placed close to a wafer coated with a
radiation-sensitive polymer film. A distant "point" source of x-rays, produced by a
focused electron beam, illuminates the mask, thus projecting the shadow
of the x-ray absorber onto the polymer film. This is the only feasible
exposure scheme since efficient x-ray lenses and mirrors for collimation purposes
have not yet been developed. The finite size of any real x-ray source leads
to some blurring of the image, as illustrated by the insert in Figure 2-8.
However, the mask-to-wafer spacing s, the source diameter d, and distance
D can always be chosen so that δ is small compared with the minimum
linewidth to be replicated. Limitations of conventional photolithography, such
as diffraction and reflection, generally can be neglected since λ ≤ 10 Å;
0.25 µm lines can be resolved for s as large as 60 µm (~2 mil).
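
The dependence of the blur δ on s, d, and D can be sketched with the usual similar-triangles relation for a finite source casting a shadow, δ = s·d/D. The text shows this geometry only in the Figure 2-8 insert, so the formula as written here is an assumption, and the example values of d and D are purely hypothetical; only the 60 µm spacing comes from the text.

```python
import math


def penumbral_blur_um(spacing_um, source_diameter_um, source_distance_um):
    """Geometric edge blur (um) from a finite source: delta = s * d / D.

    Assumed standard shadow-casting geometry; all lengths in micrometers.
    """
    if source_distance_um <= 0:
        raise ValueError("source distance must be positive")
    return spacing_um * source_diameter_um / source_distance_um


# s = 60 um is quoted in the text; d = 1 mm and D = 300 mm are hypothetical.
blur = penumbral_blur_um(spacing_um=60.0,
                         source_diameter_um=1_000.0,
                         source_distance_um=300_000.0)

print(f"Penumbral blur: {blur:.2f} um")
```

With these illustrative values the blur is 0.2 µm, comparable to the 0.25 µm linewidth mentioned above. Halving the spacing or doubling the source distance halves the blur, which is the trade the paragraph describes.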

Advantages of this approach include:

    Large area parallel exposure
    Mask fabricated by electron beam lithography
    0.1 µm resolution
    Off contact exposure (10 µm)
    Insensitive to dust and low atomic number contamination
    Use of positive or negative resists
    Uniform exposure with depth
    No requirement to place the mask or substrate in vacuum

Figure 2-8. Details of x-ray lithography.


X-Ray  Mask 

Proper construction of the mask is the key to x-ray lithography.
The substrate for the mask must transmit a reasonable fraction of the x-rays
and yet be self-supporting over the pattern area. Single crystal silicon
membranes have been fabricated by the procedure shown in Figure 2-9.
X-rays are efficiently attenuated only through absorption, a property that is
very sensitive to material and wavelength. A gold absorber has been used
to block x-rays from an aluminum target (8.34 Å wavelength). As shown in
Figure 2-9, electron beam lithography is used to create the high resolution
pattern in resist and then the pattern is ion beam etched into the gold.

X-Ray  Resists 

All of the same characteristics listed above for electron resists are
required for x-ray resists. In addition, because the resist film absorbs only
a small fraction of the total incident x-ray energy, it is necessary to effect
greater absorption within the resist. Toward this end, the Hughes effort to
incorporate heavy metal atoms, either as direct additives to the monomers
or as a soluble metal chelate, looks extremely attractive.
Figure 2-9. High resolution gold patterns fabricated by electron
beam lithography and ion beam etching, on a thinned silicon
membrane x-ray mask.
3.0 ADAPTIVE SIGNAL ENCODER DESIGN


The Adaptive Signal Encoder (ASE) functional block diagram is
illustrated in Figure 3.0-1. The ASE utilizes a predictive feedback
technique to encode analog detector signals into 10 bit digital words. In
addition, the ASE detects and erases samples affected by nuclear events and
provides adaptive selection of proper operating modes for the MFPA and
programmable spectral filter. ASE outputs are normalized to correct for
responsivity variations within the MFPA, and provision is made for periodic
automatic calibration.

In  the  following,  each  major  functional  block  is  described  in  detail, 
and  critical  device  considerations  are  discussed. 

3.1 DUAL RANGE A/D CONVERTER

As shown in Figure 3.1-1, the analog difference signal is first compared
with an analog range threshold, which is usually A/32, where A is the
largest signal magnitude in the dynamic range. The comparison result is
latched by a flip-flop which is reset for each word. If the analog
difference signal is greater than A/32, the switches select the upper
channel and the original signal is encoded by a 5 bit A/D converter. The
encoder's digital output then contains only the 5 MSB's, with the 5 LSB's
zero. If the analog difference signal is less than A/32, the lower channel
is selected. The signal is amplified by a factor of 32, then encoded by the
5 bit A/D converter. A 5 bit shift register is placed after the A/D
conversion; thus the 5 MSB's are zero and only the 5 LSB's contain the signal
information. The sample rate of the incoming analog signal is 164 K samples/
second (for a 10 Hz MFPA frame rate) and the output data rate is 1.64 M
bits/sec.


Figure 3.1-1. Two range, 5 bit A/D converter.

It can be seen that this encoding scheme needs only a 5 bit A/D
converter, yet avoids saturation when a relatively large difference signal
is present. For a large difference signal (>A/32), resolution is limited to
A/32; however, for typical inputs (<A/32), a finer resolution of A/1024 can
be obtained. In effect, five bit (32 level) resolution within any of 32 ranges
(5 bits) is obtained. The offset bias is used to accommodate fat zero and to
shift the operating potential to allow for both positive and negative
differences.
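
The two-range encoding described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the ASE circuit: it treats the input as a non-negative difference magnitude (the offset bias that handles sign and fat zero is omitted), assumes an ideal truncating 5 bit converter, and normalizes the full scale A to 1.0. The function names are hypothetical.

```python
BITS = 5
LEVELS = 1 << BITS  # 32 levels for a 5 bit converter


def encode_two_range(diff, full_scale=1.0):
    """Return a 10 bit word for a difference magnitude 0 <= diff <= full_scale."""
    threshold = full_scale / LEVELS  # analog range threshold A/32
    if diff >= threshold:
        # Upper channel: encode directly; the code fills the 5 MSBs.
        code = min(int(diff / full_scale * LEVELS), LEVELS - 1)
        return code << BITS
    # Lower channel: amplify by 32, then encode; the code fills the 5 LSBs.
    code = min(int(diff * LEVELS / full_scale * LEVELS), LEVELS - 1)
    return code


def decode_two_range(word, full_scale=1.0):
    """Map a 10 bit word back to signal units (one LSB = A/1024)."""
    return word * full_scale / (LEVELS * LEVELS)


def output_bit_rate(samples_per_second=164_000, bits_per_word=10):
    """Serial output rate implied by the sample rate and word length."""
    return samples_per_second * bits_per_word
```

Under these assumptions an input of 0.5 A maps to the word 512 (step size A/32), an input of A/64 maps to the word 16 (step size A/1024), and 164 K samples/second at 10 bits per word gives the 1.64 M bits/sec output rate quoted in the text.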

The converter of Figure 3.1-1 consists of a differential amplifier,
a flip-flop, an analog (X32) amplifier, a 5 bit A/D converter, a 5 bit shift
register, and switches. To perform efficient feedback when a large
difference signal is present, it is important to encode the signal with higher
accuracy than the A/D resolution indicates. Thus 7 bit accuracy is required
for the 5 bit A/D converter. CCD A/D converters appear suitable for this
application and are discussed in Appendix A and analyzed in Appendix B.

It should be pointed out that the analog range threshold input can be
avoided if two 5 bit A/D converters are used as shown in Figure 3.1-2.
Here the power consumption may be higher since there are two A/D
converters.


Figure 3.1-2. Different approach for dual range A/D converter.


3.2  PROGRAMMABLE  PREDICTOR 

The programmable predictor performs a polynomial data fit using
n previous frame samples

p = \sum_{k \ge 0} a_k s_k

where the sum runs over the n previous frame samples s_k.

As shown in Figure 3.2-1, the signal enters a series of (three shown) 164 K
shift register memories. The required samples for polynomial prediction are
obtained by tapping the shift register memories at appropriate points. The
coefficients a_k are programmable. Table 3.2-1 lists typical values of a_k
for 1st order to 4th order predictions. Simulations have shown all four
configurations are stable. After multiplications and summations the predicted
value is sent through a 164 K shift register to the D/A converter and other
feedback networks. Indications are that first order prediction will be
satisfactory. The coefficients are programmable by the APSP controller.

The  shift  register  memory  is  used  to  provide  inputs  for  MFPA  gain 
control  and  nuclear  event  discrimination.  In  case  a nuclear  event  is  detected, 
the  erase  pulse  enables  the  switch  to  ignore  the  output  from  the  memory, 
which  is  a contaminated  sample,  and  to  replace  it  with  the  uncorrupted  output 
from the A/D converter. This is possible because a nuclear event is
assumed to be an isolated saturation signal (see next subsection).

The design of an optimum predictor requires a knowledge of the input
statistics. Here the input process is not well defined, and for that reason the
programmable predictor is the result of a deterministically oriented design
using the Newton interpolation (backward difference) technique, which gives
rise to binomial coefficients. As further information about the input
characteristics is gathered, it becomes feasible to adapt the coefficients to
obtain the best match to the input under a minimum mean square error criterion.

Note that a large number of shift register memories is required in
this functional block unless first order differencing is utilized for the
predictive encoder. Because of the serial nature of the data, CCD memories
appear to be suitable for the application. Also, since the input signal is of
serial type, a serial-parallel multiplier appears applicable, which requires
only three full adders in this case. The data rate is 1.64 M bits/sec as
determined by the A/D conversion. The two range 5 bit A/D is more producible
than a 10 bit A/D and is expected to consume approximately 60% of the power
required for a 10 bit A/D converter.
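
The Newton backward difference predictor mentioned in this section can be illustrated with a short sketch. It assumes the textbook form in which the order-n predictor sets the n-th backward difference of the sequence to zero, which yields binomial coefficients with alternating signs. The programmed values of Table 3.2-1 are not reproduced here and may differ; the indexing convention (coefficient a_k applied to the sample k+1 frames back) and the function names are assumptions made for illustration.

```python
from math import comb


def binomial_predictor_coefficients(order):
    """Coefficients a_k, k = 0..order-1, applied to the sample k+1 frames back."""
    return [(-1) ** k * comb(order, k + 1) for k in range(order)]


def predict(previous, order):
    """Predict the next sample.

    previous[0] is the most recent frame sample, previous[1] the one
    before it, and so on; at least `order` samples must be supplied.
    """
    coeffs = binomial_predictor_coefficients(order)
    return sum(a * s for a, s in zip(coeffs, previous))
```

With this convention a first order predictor simply repeats the previous frame sample (first order differencing), a second order predictor extrapolates a straight line through the last two samples, and so on.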

3.3 TEN BIT D/A CONVERTER

A CCD D/A converter using a charge division technique is discussed
in Appendix A and analyzed in Appendix B. An extension of the
present design (8 bits) to a 10 bit device is required.

3.4 GAIN CONTROL AND NUCLEAR EVENT DISCRIMINATOR

In Figure 3.4-1 a gain control and natural nuclear event (i.e., gamma
induced radiation) discrimination system is shown for a first order
differencing predictor. Two consecutive frames, S_t and S_{t-1}, are compared
with saturation and lower thresholds, both of which can be programmed. The
criteria for nuclear event detection and saturation conditions are as follows:

1. The nuclear event erase will be issued if an isolated frame
exceeds the saturation threshold (ST). That is, S_{t-1} > ST and
S_t, S_{t-2} < ST.

2. A detector is saturated if the last two consecutive frames exceed
ST. That is, S_t and S_{t-1} > ST.
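
The two criteria above can be written as a small decision function. This is a sketch only: it follows the prose wording ("exceed" the saturation threshold) for both criteria, handles a single detector, and uses hypothetical names; the counting of horizontally and vertically spreading saturation is not modeled.

```python
def classify(s_t, s_t1, s_t2, st):
    """Classify one detector from its last three frame samples.

    s_t  : current frame sample
    s_t1 : sample one frame earlier
    s_t2 : sample two frames earlier
    st   : programmable saturation threshold (ST)
    """
    # Criterion 1: an isolated frame above ST is treated as a nuclear event.
    if s_t1 > st and s_t < st and s_t2 < st:
        return "erase"
    # Criterion 2: two consecutive frames above ST indicate saturation.
    if s_t > st and s_t1 > st:
        return "saturated"
    return "normal"
```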

The  nuclear  event  erase  pulse  enables  the  switch  at  the  output  of  the 
164  K memory  to  ignore  the  memory  output,  which  is  a contaminated  sample, 
and  to  replace  it  with  the  uncorrupted  one  directly  from  the  converter.  As 
shown  in  the  figure,  the  numbers  of  single  detector  saturation,  horizontally 
spreading  saturation  (3  adjacent  detectors  in  a row)  and  vertically  spreading 
saturation  (3  adjacent  detectors  in  a column)  are  counted  and  reported  to  the 
controller.  In  addition,  the  number  of  detector  samples  above  the  lower 
threshold  and  the  peak  value  in  a frame  are  reported  for  the  purpose  of 
optimal operating mode selection to maximize the signal to noise ratio. These
are accomplished by the adaptive dynamic range control algorithm
programmed in the APSP controller.

Figure 3.4-1. Gain control and nuclear event discrimination.

It can be seen that the circuits involved here are standard digital
logic. A 164 K shift register memory is needed to perform the
discrimination logic; again a CCD register appears suitable. It should be noted
that no additional memory is required since the same memory serves the
predictor and this functional block.


3.5 RESPONSIVITY CALIBRATION/NORMALIZATION

As shown in Figure 3.5-1, the full amplitude (or difference) signal
is normalized by multiplying it with the corresponding calibrated responsivity
coefficient in the CCD memory, using a hard-wired serial/parallel 10 bit
multiplier. The coefficients in the memory can be updated by invoking a
calibration mode and correcting for the calibration source nonuniformities.

Figure 3.5-1. Responsivity calibration/normalization.

3.6 DIGITAL CONVERTER STATUS

The purpose of this discussion is to summarize the current state of
the art in Analog to Digital and Digital to Analog converters. Units with
special features are put into perspective by the use of a comparison table.
The major emphasis is on total systems; therefore subsystem building
blocks such as sample and hold circuits, precision ladders, etc. are not
discussed. This survey of Analog-to-Digital Converters (A/D's) and
Digital-to-Analog Converters (D/A's) includes data from "Technology
Survey for Adaptive Programmable Signal Processor," CDRL item A007,
previously submitted. However, the present survey covers about 2-1/2
times as many devices as were treated in the corresponding portion of
A007.

Figure 3.6-1 is a graphical representation of A/D resolution vs
speed. Superimposed over this data is an envelope of the data gathered
by Lancaster (1). The trends are similar but higher performance devices
are presented in this (more recent) survey. Figure 3.6-2 depicts D/A's
similarly, while Figures 3.6-3 and 3.6-4 show conversion energy vs resolution.
Several interesting relationships can be highlighted by observing envelopes
in Figures 3.6-1 through 3.6-4.


a. Sample size vs sample rate.

1. The trends for A/D's and D/A's are almost identical.

2. The envelopes are also very similar in performance. Compatible
speed and resolution requirements seem to be driving both designs.

b. Energy per conversion vs resolution.

1. The development trends are similar, but A/D's require 10 to
30 times more energy per conversion than D/A's of the same
resolution.

2. The envelopes show about the same amount of spread in terms
of variability of performance.


Figure 3.6-2. D/A converter resolution vs speed.

Figure 3.6-4. Conversion energy vs resolution
for D/A converters.

This information is useful in system configuration tradeoffs, and indicates
the benefits of designs utilizing a large number of low resolution devices
instead of one high resolution device; e.g., one 11 bit A/D typically requires
as much energy as ten 6 bit machines. For encoder designs, concepts
favoring high resolution D/A's and low resolution A/D's would be preferred
if high encoder resolution is required.
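
The conversion energy figures used in this comparison follow from simple arithmetic on the survey data. The sketch below assumes, as the entries of Table 3.6-1 suggest, that power is tabulated in watts and conversion time in microseconds, so that their product is in microjoules; the function names are hypothetical.

```python
def conversion_energy_uj(power_w, time_us):
    """Energy per conversion in microjoules (watts x microseconds)."""
    return power_w * time_us


def energy_per_bit_uj(power_w, time_us, bits):
    """Energy per conversion divided by the converter resolution in bits."""
    return conversion_energy_uj(power_w, time_us) / bits
```

For example, the 16 bit MP 8016 entry (4.5 at 25 microseconds) gives 112.5 microjoules per conversion and about 7.03 microjoules per bit, which matches the tabulated values under the stated unit assumption.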

Tables 3.6-1 and 3.6-2 present only linear converters. Although
other types have significant advantages in different applications, they are
not appropriate for the ASE task. An example is the Precision Monolithics
companding D/A that follows standard nonlinear speech compression laws.

One outstanding new linear device is the Hughes 4 bit monolithic A/D
encoder, which has been operated with a 2.5 nsec conversion time; by
combining four of these devices, a 6 bit word can be generated. This device
dissipates 1.4 watts. A second device is a 6 bit monolithic D/A converter,
which converts in 6 nsec and dissipates 0.7 watt.

TABLE 3.6-1. 1975 A/D CONVERTER CHARACTERISTICS

Company: Analogic; Intech; KrU (IEEE Symp Speech); Phoenix Data; Analogic;
Preston Sci; Analogic; Computer Labs; Analogic; Datel; Teledyne;
D. S. Schover; Intersil; Intech; Hybrid Systems; DMC; Cycon; Analogic;
ICL Data Device Corp; Burr-Brown; Computer Labs; Teledyne;
McCreary & Gray; Intech; Hybrid Systems; General Instr; DMC; Datel;
Cycon; Burr-Brown

Model         Bits  Time, µs  Sample Rate, 1/t  Power, W  µjoules  µjoules/bit  Size
MP 8016       16    25        40 kHz            4.5       112.5    7.03         3 x 4.5 x 0.35
A 856-16      16    8         125 kHz
              16    20        50 kHz
ADC 1216      15    10        100 kHz
MP 8015       15    16        62.5 kHz          4.5       72       4.8          3 x 4.6 x 0.35
CMAD2 14 1J   14    5         200 kHz
MP 8014       14    10        100 kHz           4.5       45       3.21         3 x 4.6 x 0.35
MP 2914       14    10        100 kHz           2.1       21       1.51         2 x 4 x 0.4
9000          13    0.1       10 MHz*           70        7        0.5385
MP 2913       13    10        100 kHz           2.1       21       1.62         2 x 4 x 0.4
9000          12    0.1       10 MHz*           70        7        0.5833
HY 12BC       12    8         125 kHz           2         16       1.3333
EH 12 B3      12    2         500 kHz           2.325     4.65     0.3875
              12                                                                0.181 x 0.114
4129 QZ       12    24        41.7 kHz
4132          12    3.5       285.7 kHz
4133          12    2.5       400 kHz
              12    5         200 kHz
              12
A351-12       12    2.5       400 kHz
              12    20        50 kHz            0.1       2        0.1667
2530          12    6         166.7 kHz         2.6       15.6     1.3
2724          12    18        55.6 kHz          3         54       4.5
AD 12Z        12    100       10 kHz            0.25      25.5     2.12
AD 12QM       12    25        40 kHz                                            2 x 4 x 0.4
MP 2712       12    4         250 kHz           3.3       13.2     1.1          2 x 4 x 0.44
ADH 10/1      12    0.8       1.250 MHz
ADC 85        12    10        100 kHz           0.45      4.5      0.375
ADC 80        12    25        40 kHz
ADC 60        12    3.5       285.7 kHz         2.85      9.975    0.8312
9000          11    0.1       10 MHz*           70        7        0.6364
7000          10    0.1       10 MHz            75        7.5      0.75
              10                                                                0.181 x 0.114
4131          10    1         1 MHz
MOS           10    20        50 kHz
A351-10       10    1.5       666.7 kHz
ADC 580 LP    10    20        50 kHz
MEM 5014      10                                                                0.062 x 0.104
2520          10    5         200 kHz           2.6       13       1.3
2722          10    15        66.67 kHz         3         45       4.5
2726          10    18        55.56 kHz         3         54       5.4
HY 10BC       10    6         166 kHz
AD 10Z        10    55        18.2 kHz          0.25      13.75    1.37
AD 10GM       10    22        45.4 kHz                                          2 x 4 x 0.4
ADC 85              6         166 kHz


(Table  3.6-1,  continued) 


Aydin Vector
Analog Devices
Computer Labs
Kindlmann (IEEE)
Candy (IEEE Comm-22)
Analog Devices
Micronetworks
Computer Labs
Analogic
Giri & Maxwell, 1973 Intl T. M. Conf.
Computer Labs
Hughes
R. E. Fisher (IEEE MTT16 #8)
Hughes
Computer Labs

1103 

AD  7570J 
ADC  80 
ADC  85 


A 857  8 

HI  0180/0185 

25  10 

2720 

VH8B 

UH8B 

HY8DC 


ADdUM 
VMS  815 
MP  2908 


R.W.  Means  (A/D  Conv 
by  CTD) 

Navy  Case  #56,  17  1 


Bits   Time, µs   Sample Rate, 1/t   Power, W   µjoules   µjoules/bit


1. 88 

47.6  kHz 
531.9  kHz 

l 

1 MHz 

3 

3.  2 kHz 

> 

40  kHz 

1.  2 

833.  3 kHz 

1.7 

588.2  kHz 

5 

15.4  kHz 

0 

25  kHz 

0.  1 

10  MHz 

0 

11.  1 kHz 

0.5 

2 MHz 

1 

1 MHz 

5 

28.6  kHz 

5 

200  kHz 

M 

250  kHz 

0.75 

1.  3 MHz 

0 

10  kHz 

0 

10  kHz 

1 

1 MHz 

0.8 

1.25  MHz 

5 

40  kHz 

4 

250  kHz 

2 

83.  3 kHz 

0.2 

5 MHz 

0.  1 

10 MHz

4 

250  kHz 

0 

4 kHz 

0 

25  kHz 

8 

55.6 kHz

0, 0666 

15  M 

2 

500 kHz

0.  05 

20  MHz 

0.  05 

20  MHz 

0.0133 

75  MHz 

0.  005 

200  MHz* 

0. 004  16 

240  MHz* 

0.0025 

400  MHz* 

0.  01 

100  MHz 

0. 0034 

7.5 

0.  008 

0. 120  x 0. 135 

0. 9375 

0.  181  x 0.  114 

124 

'J.  637 

0.  044 

0.625 

0.78  x 1 x 0.  14 

0 625 

0.  1406 

4.  38 

0.  113  x i.  124 

1.  3 

4.  5 

0.207 

0.  103 

2812.  5 

2 x 4 x 0. 4 

208 

0.  525 

2 x 4 x 104 

0.021 

0.85 

0.202 

0.  17  1 

0. 00875 

0.  121  x 0.  164 

0.  034*. 

Recently several new types of devices have appeared in the literature.
The more interesting of these are:

• An 8 bit, 1 GHz A/D converter based on linear electro-optic phase
retardation (6), utilizing optical waveguide modulators. No actual
power or size parameters are reported.

• A 5 bit MOS monolithic clockless A/D converter (7) utilizing
portions of a continuously variable threshold device to achieve
conversion times of 2 µsec; power dissipation was not reported.

• An all MOS successive approximation weighted capacitor A/D
conversion technique (reference 8). It performs a 10 bit
conversion in 20 µsec. The acquisition time is 25 µsec; thus the
conversion rate is 22 kHz.

• At the same conference (8) R. B. Craven presented a bipolar
LSI 12 bit D/A converter consisting of a 97 x 180 mil Si-Cr
resistor network and a 79 x 179 mil chip of active circuitry; power
dissipation was not reported.


1. 

2. 

3. 

4. 

5. 

6. 

7. 

8.

4.0 MEMORY DESIGN


The general microcomputer system shown in Figure 4.0-1 contains
three types of memory: a program storage memory (PROM), a read/write
random access memory (RAM), and an information storage memory.
Improvements in semiconductor technology will enable fabrication of
smaller geometry devices, resulting in higher packing densities, an
increased number of functions placed on a single chip, and higher
operating frequencies. Improvements in microprocessors must be accompanied
by corresponding improvements in memory technology.

The information memories (pixel data, star data, etc.) consist of
large data banks, and therefore benefit from a serial (block addressed)
organization. Because of the large quantity of data in these memories,
primary consideration must be given to power dissipation and packing

Figure 4.0-1. General microcomputer system.

Figure 4.1-1. Micro-photo of Hughes 2100 I L chip.

been initiated to establish processing that will yield 4 µm bit shift registers
with minimum lateral dimensions of 0.5 µm. To overcome the problem of
aluminum grain size, aluminum gates will be replaced by polysilicon.
Although it requires an extra processing step for second level gates,
polysilicon enables the construction of much smaller devices. E-beam
technology is capable of decreasing the minimum lateral dimension by a
factor of 2 or 3 before significant small-geometry problems arise. The
concern is that small geometry reduces the bucket size and therefore the
amount of stored charge. The limited amount of stored charge introduces
threshold voltage variation as a significant problem. This problem is most
apparent in relation to refresh amplifiers, which then need much more
stringent threshold variation tolerances. Another problem of small geometry is
that decreases in lateral dimensions also require decreases (and therefore
require high control) of vertical dimensions (oxide thickness, diffusions,
etc.).

Continual  improvements  in  processing  techniques  indicate  overall 
device  yield  will  increase  to  more  than  10  percent.  Predictions,  based  on 
SPS  design  using  Electron  Beam  lithography  and  optimized  processing,  are 
that  a 320K  bit  serial  memory  capable  of  operating  in  the  20-50  MHz  range 
is  quite  feasible. 

A dual 160K bit serial memory organization (two blocks) is shown in
Figure 4.1-2 and provides for two words of storage for all pixels in one
MFPA chip (i.e., this is a block addressed memory where one block is a
full MFPA word). Chip dimensions, including peripheral circuitry, are
approximately 100 x 100 mils, and the chip contains ten 32K bit arrays in
two SPS blocks. There is one refresher per 32K bit array. Two transfers are
needed per bit; therefore the charge transfer efficiency (CTE) necessary
to retain 70 percent of the stored information before refresh is given by

CTE^{2(M+N)} = 0.7

M = Number of rows
N = Number of columns

Using a 32K bit/refresh design requires a CTE of only 0.9995.

Figure 4.1-2. 320K CCD serial memory (dual 160K blocks).
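
The required charge transfer efficiency can be checked numerically from the relation above. The text does not give the row and column counts of a 32K bit array, so the 128 x 256 layout used below is purely an assumption for illustration, as is the function name.

```python
def required_cte(rows, cols, retained=0.7, transfers_per_bit=2):
    """Solve CTE ** (transfers_per_bit * (rows + cols)) = retained for CTE."""
    return retained ** (1.0 / (transfers_per_bit * (rows + cols)))


# Hypothetical layout: 128 rows x 256 columns = 32,768 bits per refresher.
cte_32k = required_cte(rows=128, cols=256)
```

For the assumed layout the result rounds to the 0.9995 figure quoted in the text; a square array of the same capacity gives nearly the same value because the exponent depends only on M + N.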


Refresher power is given by:

P_{Ref} = C_{Ref} V^2 f N_A

C_{Ref} = total capacitance of refresh circuitry, clock leads
and pad capacitance
V = peak-to-peak clock voltage
f = clock frequency
N_A = number of arrays (10)

Assuming four volt clocks operating at a frequency of 2 MHz,

P_{Ref} = 1.9 mwatts
Ref 


Transfer power dissipation of the CCD memory is given by

P_{tr} = C V^2 f (2N + M) N_A.

Assuming four volt clocks operating at a frequency of 2 MHz (which is
conservative if advancements in threshold voltage control continue),

P_{tr} = 5.6 mwatts


Based upon an output buffer power requirement of two mwatts and an on
chip clock driver of overall 50 percent efficiency (which again is
conservative if CCD compatible CMOS is developed),

P_{Total} = 17 mwatts

The effective power delay product of the design is then:

E = 17 mwatts / (320K bits x 2 MHz) = 0.03 pJ/bit
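
The chip power budget above can be reproduced with a few lines of arithmetic. The way the terms are combined here (refresher plus transfer power divided by the clock driver efficiency, plus the output buffer power) is an interpretation that happens to reproduce the quoted 17 mwatt total; the text does not spell out the combination, and the function names are hypothetical.

```python
def total_chip_power_mw(p_refresh_mw=1.9, p_transfer_mw=5.6,
                        driver_efficiency=0.5, p_buffer_mw=2.0):
    """Clocked power scaled by driver efficiency, plus output buffer power."""
    return (p_refresh_mw + p_transfer_mw) / driver_efficiency + p_buffer_mw


def energy_per_bit_pj(total_mw, bits=320e3, clock_hz=2e6):
    """Effective power delay product in picojoules per bit."""
    joules = total_mw * 1e-3 / (bits * clock_hz)
    return joules * 1e12
```

With the default values the total comes to 17 mwatts, and dividing by 320K bits at 2 MHz gives roughly 0.027 pJ/bit, consistent with the rounded 0.03 pJ/bit in the text.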

For this application of dedicated, serial memory where there is no
need for random access and speed is not a critical issue, power dissipation
drives the design. Line addressable techniques require additional logic and
line drive capability on the common output diffusion. However, a block
addressed SPS configuration may provide the optimum trade between power
and storage time.

Examples of I/O and refresher circuitry are shown in Figures 4.1-3
and 4.1-4. The refresher circuit is a floating diffusion controlling an input
threshold gate. A logic 1 will bias VT sufficiently to allow the input
potential well to refill while a logic 0 will prevent transfer of charge.

Various storage techniques have been devised to provide the
refreshing requirement when data is not requested, although for MFPA pixel data
the memories will be in continuous use. For star catalog data a standby
storage mode will be beneficial to reduce power, assuming a typical
storage time of less than 0.3 seconds. The memory can simply be recycled at
a reduced clock rate to ensure regeneration within the CCD storage time
capability. Regeneration occurs in each of five arrays so that the 1.64 MHz
clock rate (10 Hz MFPA frame rate) can be reduced by a factor of eight without
additional chip complication, resulting in a standby power dissipation of
about 1.7 mwatts per chip at a frequency of 200 kHz.

The read/write random access memory (RAM) is used as a scratch
pad. The main design considerations here are access time (speed) and power
dissipation. The two potential candidates for a high speed, high density,
low power RAM are CMOS/SOS and the recently introduced application of
CCDs in RAMs.

The CCD RAM uses the basic CCD concept of minority carrier
storage and transfer between potential wells to provide a nondestructive
readout. Access and cycle times are comparable to present MOS
memories. The RAM consists of a monolithic array of the CCD unit cells shown
in Figure 4.1-5, connected in an N x N array as shown in Figure 4.1-6.

Figure 4.1-5. CCD RAM unit cell.

Figure 4.1-6. CCD RAM array.


Each cell has two electrodes, X and Y. A logic 1 is represented by a high
density of minority carriers and a logic 0 by a low density of minority
carriers in the well. The potential wells are produced by applying a voltage of
8V to all the X (row) lines and 5V to all the Y (column) lines. Thus, when
a logic 1 is being stored, most of the minority carriers are beneath the X
electrode (n channel device).

To  read  data  stored  in  the  CCD  cell,  an  extra  positive  pulse  is  applied 
to  the  Y electrode  (while  disconnecting  all  X electrode  voltage  sources), 
thereby  inducing  a pulse  that  drives  the  X electrode  more  positive.  The 
induced  pulse  will  be  large  if  the  stored  bit  is  a 1,  and  small  if  the  stored 
bit  is  a 0. 

To write a logic 1, the N diffusion is momentarily biased, injecting
minority carriers into the depletion region. To enter a logic 0, both
electrodes are grounded, allowing the minority carriers to recombine.

The CCD RAM requires a sense amplifier or comparator for each
X select line (row). Also, a periodic refresh is necessary to retain stored
information. Although the signal to noise ratio of the CCD RAM is adequate
for the APSP application, because of the sense amplifier's power
consumption and the necessity of refresher circuitry, a more likely candidate
for a high speed, high density, low power RAM is CMOS/SOS.

4.2 CMOS RANDOM ACCESS MEMORY

The CMOS/SOS RAM is a static memory consuming power only
during the switching of CMOS gates and during the write transient. Stored
information is retained indefinitely without the need for refresher circuitry.
The introduction of E-beam technology and improvements in device yield
will greatly increase the capability of CMOS devices (see Section 5).
Applications of E-beam technology should result in resolution capabilities of
0.5 µm, and overall device yield will probably reach 10 to 20 percent. CMOS
LSI devices are expected to operate above the 100 MHz range with gate delay
times of 0.6 nsec or less. Work is currently in progress at the Hughes
Newport Beach facility on 4K RAMs. Lithography techniques enabling
construction of 64K RAMs probably will be available by the early 1980's.

Access times should be 1-3 times the microcomputer minor cycle time.
Present minimum access times for a CMOS RAM are approximately 3-5
times the minimum clock period. Assuming a 0.6 nsec/gate delay and
3 nsec minimum clock period, minimum access times will be approximately
9 to 15 nsec. Write times will be approximately 20 nsec.

The  program  storage  memory  will  be  of  the  same  technology  as  the 
RAM  and  microprocessor  (CMOS/SOS)  and  will  therefore  have  cycle  times 
comparable  to  those  of  the  microprocessor. 

Figure 4.2-1 shows a block diagram of a 64K CMOS/SOS RAM
memory. The memory is organized in a 256 by 256 array with a 16 bit
input/output format. The basic memory cell consists of five CMOS
transistors, as shown in Figure 4.2-2. The cell is accessed by activating
the word select line and sensing the logic state on the bit line in the read
mode or by driving the bit line to the desired logic state in the write mode.
The 256 bit lines are multiplexed in groups of 16 to provide the 16 bit I/O
format. The input/outputs are 3-state bus lines connected to 3 other
devices. The necessary read/write logic and timing will be generated on
chip to minimize interconnects.

Figure 4.2-1. Block diagram of 64K CMOS RAM memory.
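
The 256 by 256 organization with a 16 bit I/O format implies 4096 addressable words, 16 per row. The sketch below shows one possible address decode. The split of the word address into a row number and a column group, and the use of contiguous groups of 16 bit lines, are assumptions for illustration only; the text does not specify the multiplexing arrangement, and the names are hypothetical.

```python
ROWS = 256
BIT_LINES = 256
WORD_BITS = 16
WORDS_PER_ROW = BIT_LINES // WORD_BITS  # 16 words share each word select line


def decode_address(addr):
    """Return (row, bit_lines) for a word address in 0..4095."""
    if not 0 <= addr < ROWS * WORDS_PER_ROW:
        raise ValueError("address out of range")
    row, group = divmod(addr, WORDS_PER_ROW)
    first = group * WORD_BITS
    # The selected word occupies 16 adjacent bit lines (assumed grouping).
    return row, range(first, first + WORD_BITS)
```

Under this assumed mapping, address 0 selects row 0 and bit lines 0 through 15, and address 4095 selects row 255 and bit lines 240 through 255.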

E-beam technology should make it possible to construct the 64K
memory on a 200 mil square chip with good yield. Assuming a 20% access duty
cycle and the microcomputer operating at a 40 MHz basic clock rate, the
memory is expected to consume approximately 15 mw with a standby power
of less than 0.3 mw.

Figure 4.2-2. Basic RAM memory cell configuration.

4.3 SUMMARY OF CRITICAL MEMORY DEVICE DESIGNS

In summary, the memory technology likely to be available in 1982
can be expected to meet the needs of the APSP. The critical technology
development requirements to realize this capability are:

• Lithography capable of about 0.2 µm line widths over a
100 mil square CCD chip (0.5 µm line widths over a
200 mil square CMOS chip)

• On chip CMOS CCD clock driver development

• Threshold voltage uniformity improvement

• Advances in LSI on chip interconnect technology

The critical memory device characteristics anticipated are:

Block Addressed (Serial Access)

• Technology - monolithic CCDs (n-surface channel) with
compatible CMOS clock drivers on chip

• Dual 160K bit chip (320K bits/chip) for two MFPA words

• Maximum clock rates >20 MHz

• Chip dissipation - 14 mwatts at 1.64 MHz clock (10 Hz
MFPA frame rate), reduced to about 4 mwatts if 2 volt
clocks are feasible

• Standby dissipation - 1.7 mwatts at 200 kHz clock

RAM

• Technology - CMOS/SOS

• 64K bit chip

• Chip size - 200 x 200 mils

• Maximum clock rates >100 MHz

• Access time ~ 12 nsec, write time ~ 20 nsec

• Standby power dissipation < 0.3 mwatt

• Active power dissipation <15 mwatts while operating with
a 10 pf bus with the 8 MHz read/write rate.


5.0 LOGIC DEVICE DESIGN (CMOS/SOS)


Advancements in integrated circuit processing technology have been
occurring at a rapid rate over the last few years, and it appears that
integrated circuit performance will continue to improve in the years ahead. As
these improvements are realized, device speeds will increase, speed-power
products will decrease, and the number of devices per given area of chip
real estate will increase. This will enable the construction of LSI circuits
much more powerful than those in use today. Microprocessors operating
at clock frequencies in excess of 100 MHz with a speed-power product of
less than 0.1 pJ/gate, and a gate density of more than 1000 gates/mm^2,
should be within the realm of usable technology within six years.

Although there are technologies available today for 200 MHz logic,
none operate close to the speed-power product requirements of the APSP.
What is required is a technology that consumes little power and has the
potential to reach the speeds needed. The most promising is CMOS,
because it consumes power only while switching, and so has inherently low
power consumption while still allowing high speed operation. As processing
technology improves, the devices can be made smaller, which will
decrease charge transit time and gate capacitance, with a corresponding
increase in device speed. The lower capacitances will mean smaller device
currents. This, along with decreased operating voltage, will greatly reduce
the power required for CMOS operation. CMOS performance can also be
improved by using an insulating substrate; parasitic capacitance is reduced,
causing a considerable increase in device speed, and device density is
increased since isolation diffusions are not required. CMOS silicon on
sapphire (SOS) is presently being developed by several manufacturers. This
technology has a disadvantage in that it is difficult to grow silicon on sapphire.
The silicon tends to have defects in the crystal, causing unusually high leakage
currents. Improvements in growing silicon have minimized this problem,
and further improvements can be expected. Recent work at the Hughes
Research Laboratories in Malibu has dealt with the possibility of using high
resistivity silicon as an insulator. The growing of low resistivity silicon on
this substrate would eliminate the crystal defect problem (but creates a
substrate leakage problem). The probability of this technology becoming usable
in the next few years is high, and it would prompt further improvements in
CMOS performance.

These technological advancements will enable CMOS to operate with
delay times less than 0.4 ns, compared to present CMOS LSI delay times of
approximately 5 ns, and less than 0.02 pJ power-delay product compared to
present LSI power-delay products greater than 1 pJ (with fan outs of 4 or 5).
The gate density will be greater than 1000 gates/mm² for LSI random logic
compared to present day LSI gate densities of approximately 90 gates/mm².
These advanced CMOS gates will be used to construct memory cells for
Read Only Memories (ROM) and Random Access Memories (RAM) as well
as LSI for micro processors for high speed, low power, and large data
handling capability systems.

Since  the  key  to  producing  CMOS  that  meets  the  criteria  stated  earlier 
is  reducing  device  size,  an  analysis  of  the  problems  involved  follows.  The 
potential  problems  are: 

1.  Approximations  used  in  conventional  device  modeling  do 
not  hold  for  small  geometry  devices. 

2.  Present  photographic  techniques  of  mask  making  and  wafer 
exposing  are  limited  by  the  wavelength  of  light. 

3. Voltages that can be applied from drain to source are limited
due to the reverse bias diode punchthrough voltage.

4.  Threshold  voltage  becomes  harder  to  predict  and  control,  and 
more  significant  as  devices  and  applied  voltages  get  smaller. 

These  problems  are  examined  individually  in  the  following  discussion. 
Swanson (5-2) has analyzed small geometry MOS devices.
In conventional device modeling, it is assumed that current flow between
the drain and the source is one dimensional. This assumption is no longer
valid for small geometry devices. Using a two dimensional model for source
to drain current, equations for drain current (ID) as a function of gate and
drain voltages, and for pair delay time, were derived and are shown below:
where:

W = Channel width in cm
L = Channel length in cm
µ = Electron surface mobility in cm²/volt-sec
Cox = Oxide capacitance in f/cm²
t = MOST gate thickness in angstroms
Cdg = Depletion capacitance in f/cm²
Vs = Supply voltage in volts

Using these equations, a graph of propagation delay time and power
consumption as a function of L and Vs was derived and is shown in
Figure 5.0-1.

If a device were constructed with a 0.5 µm channel length and
operated with Vs = 0.8 volts, from Figure 5.0-1 the propagation delay would
be 0.1 ns and the power-delay product would be 0.0003 pJ. This is the
performance of an ideal device without loads. A ring oscillator would
dissipate more power, and performance would be further degraded when
implementing LSI with the associated interconnect capacitances. Even after
taking these effects into account, such devices still appear to meet
the criteria stated earlier. CMOS noise immunity is high for high supply
voltages and decreases as the supply voltage is lowered. However, the
internally generated noise also decreases correspondingly. Sensitivity to
external and power supply noise will require careful handling of the filtering
and shielding designs.

Present photographic techniques are not adequate to construct a mask
with 0.5 µm line width. Typical devices today have 3 to 10 µm line widths,
with line widths as small as 1 µm being obtained under ideal conditions. To
produce detail as small as 0.5 µm, a technology using wavelengths shorter
than those of light is required. X-ray and/or E-beam technology are the
potential candidates for accomplishing this. These methods are both in the
advanced development stage and are discussed in detail later.

Figure 5.0-1. Power-delay products showing ideal CMOS
device limitations as a function of gate length and device
voltage (Adapted from Swanson (5-1)).

When voltage is applied to the drain of a MOS transistor, a reverse
bias junction exists between the drain and the substrate. A depletion region
is formed around the drain and is a function of drain to substrate (and thus
drain to source) voltage. As the source to drain spacing is decreased (or
the source to drain voltage increased) this depletion region can extend to the
source. When this occurs, current that is not a function of the gate voltage
will flow from drain to source. This condition is called punchthrough. The
equation for punchthrough voltage from a reverse biased diode is given as


$$V_{PT} = \frac{q N L^{2}}{2 \epsilon_{s} \epsilon_{o}}$$

where:

q = Electron charge in coulombs
N = Substrate doping concentration in carriers/cm³
L = Channel length in cm
εs = Semiconductor dielectric constant
εo = Permittivity of free space

Punchthrough voltage for L = 0.5 µm and 1.0 µm as a function of
substrate concentration is shown in Figure 5.0-2. For L = 0.5 µm and
Vs = 0.8 volts, it can be seen that a minimum substrate concentration of
5 x 10^15 carriers/cm³ is required. This is within the range of
concentrations used in present CMOS/SOS.
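The quoted minimum concentration can be checked numerically. The sketch below assumes the standard one-sided abrupt-junction form of the punchthrough expression, V_PT = q N L² / (2 εs εo), and a relative dielectric constant of 11.7 for silicon; both are assumptions made for illustration, and the function names are hypothetical.

```python
# Minimal sketch, assuming V_PT = q * N * L**2 / (2 * eps_s * eps_0).
Q = 1.602e-19        # electron charge, coulombs
EPS_0 = 8.854e-14    # permittivity of free space, farad/cm
EPS_SI = 11.7        # relative dielectric constant of silicon (assumed)

def punchthrough_voltage(n_sub, length_cm):
    """Punchthrough voltage (volts) for doping n_sub (carriers/cm^3)."""
    return Q * n_sub * length_cm ** 2 / (2.0 * EPS_SI * EPS_0)

def min_concentration(v_supply, length_cm):
    """Smallest doping (carriers/cm^3) keeping V_PT at or above v_supply."""
    return 2.0 * EPS_SI * EPS_0 * v_supply / (Q * length_cm ** 2)

if __name__ == "__main__":
    half_micron = 0.5e-4  # 0.5 um expressed in cm
    print(min_concentration(0.8, half_micron))      # roughly 4e15
    print(punchthrough_voltage(5e15, half_micron))  # a little under 1 volt
```

Under these assumptions the 0.8 volt supply needs a little over 4 x 10^15 carriers/cm³, which is consistent with the rounded figure of 5 x 10^15 read from Figure 5.0-2.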

One of the critical aspects of short channel MOS is the threshold
voltage variation. The expression for threshold voltage in long channel
MOS devices is
Figure 5.0-2. Vpunchthrough vs substrate concentration.
where:

QSS = charge at oxide-semiconductor interface in coulomb/cm²
QFS = fast surface state charge in coulomb/cm²
φF = Fermi level referenced to mid gap in volts
N = substrate concentration in carriers/cm³
Cox = εox/tox in f/cm²
εox = MOST gate oxide dielectric constant, 3.5 x 10^-13 farad/cm
tox = oxide thickness in angstroms.

From this expression, VT is a function of the substrate concentration and
the oxide thickness, but should not be a function of the device width or length.
This is not the case, however. As short lengths are approached, the
threshold voltage drops (5-2), as is shown in Figure 5.0-3. The threshold voltage
is also sensitive to device width, but this interaction in small geometry
devices is not well documented. From Figure 5.0-3 it can be seen that the
threshold voltage in a device with a channel length of 0.5 µm will be very
sensitive to channel length. This could pose a problem in achieving
close tolerances on threshold voltages in small geometry devices over a
large chip.


Figure  5.0-3.  Vt  versus  channel  length. 
Considerable research has gone into the problem of shifting threshold
voltages into the ranges needed for low voltage devices (approximately 0.2
to 0.5 volts) through the use of ion implantation. A layer of dopant material
is implanted just beneath the gate oxide, with the impurity level and polarity
determined by the desired shift in threshold voltage. The maximum shift in
voltage is approximately 5 volts, more than enough to move average
threshold voltages into the 0.2 to 0.5 volt range.

If scaling down of present CMOS/SOS gate packing densities were all
that was required for small geometry devices, present densities could be
increased more than two orders of magnitude. But there are several
potential problems. Aluminum interconnects tend to have defects that can stop
conduction. Present day technology limits aluminum interconnect line widths
to 1 µm or larger. Polysilicon can be made in 0.5 µm line widths, but
polysilicon has approximately 200 ohms per square resistance and long lines
would seriously degrade performance. Fortunately, the isolation areas
required for bulk CMOS are not required for CMOS/SOS, so silicon areas
can be as close together as lithographic technology permits. A typical
current CMOS/SOS layout is shown in Figure 5.0-4. This chip has 7 µm
channel length transistors and a packing density of 360 transistors or 90
gates/mm². The minimum line width for the aluminum interconnects is
approximately 3 µm. By scaling down to 0.5 µm channel length, the gate
packing density would be 14² x 90 = 17,640 gates/mm². However, if the
aluminum interconnects are limited to 1 µm, the maximum packing density is
3² x 90 = 810 gates/mm². With the appropriate use of polysilicon for short
interconnects, the goal of more than 1000 gates/mm² can be met. For non-random
connected logic (such as RAM), gate densities on the order of 3000 gates/mm²
are anticipated.
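The two density figures follow from scaling area by the square of the linear shrink factor. A minimal sketch of that arithmetic, with a hypothetical helper name and the 90 gates/mm² baseline taken from the text:

```python
def scaled_density(base_density, base_width_um, new_width_um):
    """Gate density after a linear shrink; area scales as the square."""
    shrink = base_width_um / new_width_um
    return base_density * shrink ** 2

# Channel length limited case: 7 um transistors shrunk to 0.5 um.
channel_limited = scaled_density(90, 7.0, 0.5)
# Interconnect limited case: 3 um aluminum lines shrunk to 1 um.
metal_limited = scaled_density(90, 3.0, 1.0)

if __name__ == "__main__":
    print(channel_limited, metal_limited)
```

The interconnect limited value falls short of the 1000 gates/mm² goal, which is why the text relies on polysilicon for short interconnects.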

Work now under way at the Hughes Newport Beach and Carlsbad
facilities may permit even greater packing densities. Methods of using multiple
layer metallization are being investigated. Using 2 layers of aluminum would
allow greater numbers of transistors to be interconnected in a given area,
greatly improving the LSI design flexibility. Both LSI interconnect and logic
function organization represent a greater risk than the actual device
capabilities.
Figure 5.0-4. CMOS/SOS 256 bit static shift register (complete chip:
4200 transistors, chip size 130 x 138 mils; unit cell: 16 transistors).


In summary, this section has shown that a device that can operate in
an LSI logic circuit at clock rates exceeding 100 MHz with a power-delay
product of less than 0.1 pJ (possibly <0.01 pJ) and with a packing density
greater than 1000 gates/mm² should be achievable within six years with low
risk. Such devices would be CMOS on an insulating substrate, probably
sapphire, and would use channel lengths on the order of 0.5 µm. How this
device would compare to presently available logic is shown in Figure 5.0-5;
performance would be more than an order of magnitude better than anything
available today. The critical technology development requirements to
realize this capability are:

• Lithography capable of 0.5 µm line widths over a 200 mil
square chip.

• CMOS on insulator process development with emphasis
upon threshold voltage uniformity and reduction of
leakage current.

• Advances in LSI on chip interconnect technologies and
logic organization.
Figure 5.0-5. CMOS/SOS design LSI performance expectations.
The critical CMOS/SOS gate device characteristics (low risk)
anticipated are:

• 0.4 to 0.8 nsec LSI gate delay

• 0.05 to 0.15 pJ/gate LSI power-delay product

• >1000 gates/mm² for random LSI logic (28,000 gates/chip,
200 mil² chip maximum)

• >3000 gates/mm² for well organized (RAM) LSI logic
(84,000 gates/chip, 200 mil² chip maximum)
6.0 LOGIC ARRAYS AND FUNCTIONS


In the following section two examples of critical digital functions are
described that need to be implemented in order to achieve the
cost/performance goals of the APSP program. The two are: arithmetic units,
particularly a high speed multiply function, and a micro processor implemented in
high density, low power technologies. Because of its use in the tracking
function, the latter has been named the µPT, for micro processor tracker.

To provide the low power and high speed required, the APSP will be packaged
in a single hybrid package.


6.1 HIGH SPEED MULTIPLY

Due to the large number of multiply instructions expected in tracking
computations, it is desirable to have a dedicated portion of the µPT's
arithmetic unit composed of a high speed (parallel) multiplier.

Figure 6.1-1 is a functional diagram of a standard expandable 4 x 4
multiplier made up of full adders (FA) and appropriate delays (t). This 4 x 4
array can be stacked to accomplish the 16 x 16 bit multiply used in the µPT.

To implement the multipliers, it is necessary to develop techniques
for generating all the a_i b_j products at the various times they are needed.
Using gate equivalents for a full adder and gate equivalents for a half adder, it
is estimated that the above scheme will require approximately 18,000 gate
equivalents, which is within the realm of feasibility for 1985 low power technology.
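The role of the a_i b_j product terms and the full adders can be illustrated with a small software model. This is only a behavioral sketch of a generic unsigned array multiplier; it does not reproduce the delay elements or the exact cell arrangement of Figure 6.1-1, and the function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    total = a + b + carry_in
    return total & 1, total >> 1

def array_multiply(x, y, n_bits=4):
    """Unsigned n x n multiply built only from AND terms and full adders."""
    a = [(x >> i) & 1 for i in range(n_bits)]
    b = [(y >> j) & 1 for j in range(n_bits)]
    acc = [0] * (2 * n_bits)            # running sum bits, LSB first
    for j in range(n_bits):             # one adder row per multiplier bit
        carry = 0
        for i in range(n_bits):
            partial = a[i] & b[j]       # the a_i b_j product term
            acc[i + j], carry = full_adder(acc[i + j], partial, carry)
        acc[j + n_bits] = carry         # row carry-out is the next high bit
    return sum(bit << k for k, bit in enumerate(acc))

if __name__ == "__main__":
    print(array_multiply(13, 11))                 # 4 x 4 case
    print(array_multiply(40000, 65535, n_bits=16))  # 16 x 16 case
```

Each row adds one shifted set of product terms to the running sum, so an n x n multiply uses n rows of n adders in this model.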


6.2 APSP TRACK PROCESSOR

The Track Processor is the link between the Signal Processing
elements of the APSP and the Data and Control Processor (reference
figure 4.0-1). The Track Processor receives filtered data, in the form of "hits"
(defined as threshold exceedances, i.e., potential targets), from the Signal
Processor. These data undergo correlation with previous frame data
(commonly called tracking). Hits that appear to be the logical continuations of
target tracks are used to update those tracks, while those that are found to
be clutter are discarded. Data describing all tracks currently being
monitored is sent periodically to the Data and Control Processor. The general
requirements which drive the µPT design are summarized below.

Figure 6.1-1. Expandable 4 x 4 multiplier.

6.2.1 Requirements

Throughput  Requirement 

The computation rate for the Track Processor is derived from the
expected hit rate, which is estimated to be a maximum of 1 pixel of every
100 in the MFPA containing a hit. For a single MFPA chip (128 x 128
detectors) this implies at least 160 hits per 0.1 sec frame time. Assuming a
Poisson distribution of hits over the entire focal plane, µ + 3σ = 200 hits.
Given that the frame time is 0.1 seconds, the processing time for one
hit averages 0.5 msec. The number of instructions executed to process one
hit is estimated to be between 2000 and 4000, thereby giving a throughput
requirement for the µPT of 4 to 8 MIPS.
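The throughput figures can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the helper names are hypothetical, and the rounding of the design hit count to 200 follows the text.

```python
import math

PIXELS_PER_CHIP = 128 * 128   # detectors on one MFPA chip
HIT_FRACTION = 1 / 100        # at most 1 pixel in 100 contains a hit
FRAME_TIME_S = 0.1            # MFPA frame time in seconds

def design_hit_count(pixels=PIXELS_PER_CHIP, fraction=HIT_FRACTION):
    """Mean plus three standard deviations of a Poisson hit count."""
    mean = pixels * fraction
    return mean + 3 * math.sqrt(mean)

def required_mips(instructions_per_hit, hits_per_frame=200,
                  frame_time_s=FRAME_TIME_S):
    """Instruction rate, in millions per second, to clear one frame."""
    return instructions_per_hit * hits_per_frame / frame_time_s / 1e6

if __name__ == "__main__":
    print(design_hit_count())                       # close to 200
    print(required_mips(2000), required_mips(4000))
```

With 200 hits in a 0.1 second frame, each hit has 0.5 msec available, so 2000 to 4000 instructions per hit correspond to 4 to 8 MIPS.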

Memory Requirements

A word length of 16 bits is sufficient to represent all measurements
in the APSP, as well as allowing overflow bits for arithmetic operations.
Sixteen bits also allow a powerful instruction format. The quantity of memory of
the µPT will be 7K words: 5K for programs, constants and variables, 1K for
current hit data and 1K for the alternate hit buffer. The size of the hit
buffer is based on the maximum number of hits per frame being less than 200.
As described in the processor architecture section, overloading, i.e., more
than 200 hits per frame, is accommodated by raising the adaptive threshold
under the control of the µPT. Each hit will be transmitted from the Signal
Processor as a data block. The size of each data block is estimated to be
4 words. Thus the data input from the signal processor for one frame will
be approximately 200 x 4 words, which is less than 1000 words.

I/O Requirements

The µPT will have the following I/O functions:

1. input of hit data from signal processor,

2. output of thresholding and algorithm selection information to
the signal processor,

3. input and output of target chip-boundary crossing information
to and from 8 neighbors,

4. output track data onto track bus,

5. program load from Data and Control Processor,

6. receive external interrupts from Data and Control Processor.

The µPT should have dedicated interfaces for each of those I/O
functions.


6.2.2 Architecture

Generally, the µPT (shown in figure 6.2-1) is a bus-organized 16 bit
processor, the special features of which are: 128 fast access registers
located on the arithmetic chip, a 16 x 16 bit fully parallel multiply network
(2 clock cycle full multiply), a 7-level priority interrupt structure,
autonomous I/O interfaces, and a dual port, automatically switching memory.

6.2.3 Implementation

Given the above requirements as well as requirements for minimum
power and size, the technology assumed for implementation is CMOS/SOS
(complementary MOS/Silicon on Sapphire). The two key factors that set the
µPT apart from all current and near future micro processors are the high
gate densities on the chip and the very low logic delay and small access times
of memories.

The gate densities expected are in the range of 7000 to 28,000 gates
of random logic per chip, on a chip with 40 to 200 pads. For the mid 1980s time
period, this is a low risk requirement. In determining the partitioning of the
architecture, the gate equivalent circuits were kept under 25,000 gates.
This will allow for undetermined logic increase and less than maximum
density on any chip.

Figure 6.2-1. µPT block diagram.

The  requirement  of  8 MIPS  imposes  a clock  rate  determined  by  the 
average  number  of  minor  cycles  per  instruction.  Assuming  5 minor  cycles 
per  instruction,  the  clock  rate  would  be  40  MHz. 

The delays assumed for this technology are

Item                Delay (nanoseconds)

Gate delay          0.4
RAM access          8
ROM access          8
Off chip connect    2

Using a 40 MHz clock and 0.4 nsec gate delay, 62 gates is the longest
allowable chip path and is more than required. Hence, the gate level design will
be constrained to this figure for all single cycle operations.
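The clock rate and the 62 gate limit both follow from simple division. A minimal sketch of that reasoning, using the stated 8 MIPS, 5 minor cycles per instruction, and 0.4 nsec gate delay; the function names are hypothetical:

```python
def clock_rate_mhz(mips, minor_cycles_per_instruction):
    """Clock rate implied by an instruction rate and cycles per instruction."""
    return mips * minor_cycles_per_instruction

def max_gates_per_cycle(clock_mhz, gate_delay_ns):
    """Longest chain of gates that settles within one clock period."""
    period_ns = 1000.0 / clock_mhz
    return int(period_ns / gate_delay_ns)   # whole gates only

if __name__ == "__main__":
    clock = clock_rate_mhz(8, 5)
    print(clock, max_gates_per_cycle(clock, 0.4))
```

A 40 MHz clock gives a 25 nsec period, and 25 / 0.4 = 62.5, so 62 whole gate delays fit in one cycle. This ignores the off chip connect and memory access delays listed in the table.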

6.2.4  Partitioning 

The µPT consists of 9 chips:

1. An arithmetic chip which contains the 128 general registers,
the arithmetic-logic unit (ALU), the multiply network, and
related functional units;

2. A sequencing and I/O chip which contains the program counter,
the interrupt structure, the memory accessing hardware and the
autonomous I/O interfaces;

3. Two memory chips, each containing 4096 words, 16 bits each;

4. A microprogram control chip which contains the
microprogram control unit for the entire µPT.

6.2.4.1 The Arithmetic Chip

The architecture of the Arithmetic chip is shown in figure 6.2-2.
Except for various control and status lines, the only data path leading off
this chip is the Main Bus. Memory data will be transmitted and received
over the Main Bus.

Figure 6.2-2. Register level diagram of arithmetic chip.

The chip contains two internal busses: the Arithmetic Bus (A-Bus),
which performs most register transfers on the chip, and the Iteration Counter
Bus (I-Bus), which allows selection of inputs to the Iteration Counter (I). A
group of 128 16 bit registers is provided for the user program, and offers fast
access (1 cycle). The constant read-only memory (CROM) contains certain
constants and masks necessary in the implementation of the instruction set.
The CROM is addressed by the microprogram. The CROM is 16 bits wide
and its length is estimated to be 16 words.

The Iteration Counter is an 8-bit up-down counter used in the
implementation of iterative instructions, such as shifts, block moves, and division.

The ALU has two 16 bit inputs designated A (left) and B (right). The
output of the ALU is one of the following functions of A and B:


A          A .OR. B      -B
B          A .AND. B     all 0's
A + B      A .XOR. B     all 1's
A - B      NOT A

In addition to the 16-bit result, the ALU detects overflow for the operations
A + B, A - B and -B.
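A behavioral sketch of such an ALU is given below. It assumes two's complement arithmetic, which the text does not state explicitly, and it models only the operations that are legible in the list above. The operation names and the function signature are hypothetical.

```python
WORD_MASK = 0xFFFF   # 16-bit word
SIGN_BIT = 0x8000

def to_signed(value):
    """Interpret a 16-bit pattern as a two's complement integer (assumed)."""
    return value - 0x10000 if value & SIGN_BIT else value

def alu(op, a, b):
    """Return (16-bit result, overflow flag) for one ALU operation."""
    a &= WORD_MASK
    b &= WORD_MASK
    arithmetic = {
        "A+B": to_signed(a) + to_signed(b),
        "A-B": to_signed(a) - to_signed(b),
        "-B": -to_signed(b),
    }
    if op in arithmetic:
        exact = arithmetic[op]
        overflow = not (-0x8000 <= exact <= 0x7FFF)
        return exact & WORD_MASK, overflow
    logical = {
        "A": a, "B": b,
        "OR": a | b, "AND": a & b, "XOR": a ^ b,
        "ZEROS": 0, "ONES": WORD_MASK,
    }
    return logical[op], False   # logical results never overflow

if __name__ == "__main__":
    print(alu("A+B", 0x7FFF, 1))   # largest positive value plus one overflows
    print(alu("A-B", 5, 7))
```

Overflow is flagged only for the three arithmetic operations, matching the text; the exact signed result is compared against the range of a 16-bit word.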


The Multiply Network performs fully parallel multiplication of two
16-bit numbers in two clock cycles. The Multiply Network has two 16-bit
inputs (multiplicand and multiplier) and two 16-bit outputs (most and least
significant words of the result) on the A-Bus. The instruction buffer register
holds the instruction currently being processed. It is 16 bits wide and can
be loaded from the V-register.

The Flag generation logic network monitors the value on the A-Bus.
Three flags are generated: A-Bus = 0, A-Bus > 0 and A-Bus < 0. All
three flags are used by the microprogram control unit (MCU) for branching.

6.2.4.2 Sequencing and I/O Chip

The sequencing and I/O chip is shown in figure 6.2-3. The data paths
leading off this chip are: 1) Main Bus (16 bits), 2) Address Bus (13 bits),
3) Input to Track-Bus (16 bits), 4) Two-way to neighbors (8 bits), 5) Output
to Signal Processor (1 bit), 6) Input from previous Vector Buffer Controller
(1 bit) and 7) Output to subsequent Vector Buffer Controller (1 bit).

Internally this chip contains another bus, the Program Counter Bus
(PC-Bus), which allows selection of inputs to the Program Counter (PC).
The Memory Address Register (MAR) consists of a 13-bit address and an
"indirect" bit. These correspond to the 13 LSB and the MSB of a 16 bit word
respectively. The MAR can be loaded from the Main Bus only. There are
three sources of memory addresses in the µPT: the Program Counter,
addresses in the instruction stream, and indirect addresses.
Figure 6.2-3. Sequencing and I/O chip functional block diagram.

The Index register is 13 bits wide and is used to hold the contents of
one of the General Registers when indexed addressing is used. The Index
Adder is used for every memory access when the address originates from
the MAR (indexed or unindexed). The Index Adder can produce 2 possible
results: INDEX + MAR or MAR. The output of the Index Adder is 13 bits
wide and can be transmitted to the memory chips via the Address-Bus. The
Interrupt Vector (IV) consists of a 7-bit register and associated logic. The
7-bit Interrupt Mask register (IM) can be used to suppress the servicing of
any interrupt levels. The contents of the IM are ANDed with the IV before
any further decisions are made.
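The address formation and interrupt masking just described can be sketched in a few lines. Several details are assumptions made for illustration and are not stated in the text: that INDEX + MAR wraps around at 13 bits, that a 1 in the mask enables a level, and that the highest numbered bit is the highest priority. The function names are hypothetical.

```python
ADDRESS_MASK = 0x1FFF   # 13-bit memory address
LEVEL_MASK = 0x7F       # 7 interrupt levels

def effective_address(mar, index=0, indexed=False):
    """Index Adder output: INDEX + MAR when indexed, otherwise MAR."""
    address = mar & ADDRESS_MASK
    if indexed:
        # Wraparound at 13 bits is an assumption.
        address = (address + (index & ADDRESS_MASK)) & ADDRESS_MASK
    return address

def pending_level(interrupt_vector, interrupt_mask):
    """AND the IV with the IM, then pick one level (highest bit assumed)."""
    active = interrupt_vector & interrupt_mask & LEVEL_MASK
    return active.bit_length() - 1 if active else None

if __name__ == "__main__":
    print(hex(effective_address(0x0100, 0x0010, indexed=True)))
    print(pending_level(0b0101000, 0b0001111))
```

The second call shows masking at work: level 5 is requested but suppressed by the mask, so level 3 is the one selected.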

The  Program  Counter  (PC)  is  a 13-bit  counter  loaded  from  the 
PC-Bus,  and  contains  the  address  of  the  next  instruction  to  be  executed. 
Addresses  from  the  PC  can  be  transmitted  to  the  memory  chips  over  the 
Address  Bus. 

There are 7 Trap Address Cells (TAC), each a 12 bit wide register
corresponding to one interrupt level. Each TAC contains a memory address,
and whenever the respective level of interrupt occurs, interrupt handling is
started at that address.

The Interrupt Stack consists of 7 registers corresponding to 7 nested
interrupts. Each register consists of a 12-bit resumption address and a
3-bit resumption level. The Vector Buffer represents the I/O interface with
the Track-Bus, and contains one cell of storage for each track. Currently,
it appears that 32 such cells will be sufficient. Each cell will be composed
of a number of 16-bit words.

The Threshold Control Register (TCR) is parallel input/serial output
organized, and provides the I/O interface between the µPT and the
corresponding Signal Processor. The Message Control Network (MCN) together
with the Incoming Message Register (IMR) and the Outgoing Message Register
(OMR) form the I/O interface with the neighboring µPT's.


6.2.4.3 Memory Chip

The RAM Memory chip has been discussed in Section 4.1.

6.2.4.4 The Microprogram Control Unit (MCU)

The MCU chip is shown in figure 6.2-4. The following data paths
lead off the MCU chip:

1. Encoded Control Signals to all other chips of the µPT.

2. Status Flags from all other chips of the µPT.

3. Op-code from IBR on the Arithmetic Chip.

4. INT-flag from the Sequencing and I/O chip.

Figure 6.2-4. The microprogram control unit register level diagram.

The MCU consists of a 512 word x 32 bit ROM, and a Command
Register which can hold one word from the ROM. During each minor cycle
one word is read from the ROM and placed in the Command Register.

The 9-bit address of the next ROM-word to be fetched can come from
one of 3 sources: 1) 8 bits from the Next Address Field of the Command
Register, concatenated with one bit based on a status flag; 2) 6 bits from the
Op-code field of the IBR with three zeros as MSB's; or 3) a hardwired address
used to branch into a section of the microprogram dedicated to trapping
interrupts.

Selection between these sources is made by the Address Multiplexor.
The selection is based on three control bits: one is the INT signal from the
Interrupt Structure, the others originate from the Command Register.
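The three address sources can be modeled as follows. The text does not give the encoding of the control bits, the position of the status flag bit, or the value of the hardwired trap address, so this sketch assumes the flag occupies the least significant bit, uses symbolic source names in place of control bits, and uses a placeholder trap address. All names are hypothetical.

```python
ROM_WORDS = 512               # 9-bit microprogram address space
TRAP_ADDRESS = 0x1FF          # placeholder; the real value is not given

def next_rom_address(source, next_address_field=0, status_flag=0, op_code=0):
    """Form the 9-bit address of the next microprogram word."""
    if source == "interrupt":
        # Hardwired entry into the interrupt trapping microcode.
        return TRAP_ADDRESS
    if source == "decode":
        # 6-bit op-code with three zero most significant bits.
        return op_code & 0x3F
    if source == "sequence":
        # 8-bit Next Address Field concatenated with one status flag bit
        # (flag assumed to be the least significant bit).
        return ((next_address_field & 0xFF) << 1) | (status_flag & 1)
    raise ValueError("unknown address source: " + source)

if __name__ == "__main__":
    print(hex(next_rom_address("sequence", 0xAB, status_flag=1)))
    print(hex(next_rom_address("decode", op_code=0x2C)))
```

Concatenating a status flag with the Next Address Field gives every microinstruction a two-way conditional branch without a separate branch instruction.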

6.2.5  Track  Processor  Sizing 

Given the above architecture, the number of gate equivalents can be
estimated to determine the number of chips required to implement this µPT,
and the expected power dissipation. Table 6.2-1 summarizes the gate
equivalents for each function. The data from this table is further simplified
by assuming 5 gate equivalents per state device, i.e., flip flop.

Further assumptions on gate equivalents are: 1) one bit of ROM
corresponds to one gate, and 2) one bit of RAM corresponds to 1.5 gates.
Using a figure of loK bits per memory chip allows space for control,
address decode and timing logic to be fabricated on the same chip.

6.2.5.1 Arithmetic Unit

The largest portion of the Arithmetic Unit is the multiply network.
The multiply network is fully parallel, and based on common algorithms is
estimated to be on the order of 18,000 gate equivalents.

TABLE 6.2-1. GATE EQUIVALENTS

(Columns: Unit, No. Bits, No. Gates, No. State Devices. Units listed: ALU;
Adder; CLA (Carry-Lookahead Adder); Comparator; Counter; MUX (including
2:1 Quad and 4:1 Dual); Priority Encoder; Register File; Register, Parallel
I/O and Serial I/O; Register, Parallel I/O; Register, Serial In, Parallel Out;
Register, Parallel In, Serial Out. Numeric entries illegible.)

The remainder of the logic sizing is as follows:

1. Registers: A, B, M, N, U, V, X (16 Bits); I (8 Bits)

a. (7 Regs)(16 Bits/Reg)(1 Unit/4 Bits)[27 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 1316 g.e.

b. (1 Reg)(8 Bits/Reg)(1 Unit/4 Bits)[27 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 94 g.e.

2. ALU/CLA: 275 g.e.

3. CROM:

(16 Bits/Word)(16 Words)(1 g.e./Bit) = 256 g.e.

4. General Registers:

(128 Words)(16 Bits/Word)(1.5 g.e./Bit) = 3072 g.e.

5. Flag generation logic (comparator):

(4 Units)(31 g.e./Unit) = 124 g.e.

6. Miscellaneous logic, including I - 0, QB, Command Decode, etc., is
estimated at 10%.
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0. 

19000 

1. 

1410 

2. 

275 

O 

256 

4. 

3072 

5. 

124 

24137 

6. 

2413  (10%) 

26,  55  0 gate  equ 
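The Arithmetic Unit tally can be re-derived from the per-item formulas. The sketch below is only a cross-check under the report's stated conversion factors; the helper name `register_ge` is hypothetical, the ALU/CLA figure is taken as stated because its formula is not fully legible, and the multiply network uses the 19,000 figure that appears in the tally.

```python
FF_GE = 5  # gate equivalents per flip-flop, as assumed in the text


def register_ge(n_regs, bits, gates_per_unit=27, ffs_per_unit=4, bits_per_unit=4):
    """Gate equivalents for n_regs registers built from 4-bit units."""
    units = n_regs * bits / bits_per_unit
    return units * (gates_per_unit + ffs_per_unit * FF_GE)


arithmetic_unit = {
    "multiply network": 19000,                              # figure used in the tally
    "registers": register_ge(7, 16) + register_ge(1, 8),    # 1316 + 94
    "ALU/CLA": 275,                                         # taken as stated
    "CROM": 16 * 16 * 1.0,                                  # 1 g.e. per ROM bit
    "general registers": 128 * 16 * 1.5,                    # 1.5 g.e. per RAM bit
    "flag generation": 4 * 31,
}

subtotal = sum(arithmetic_unit.values())
misc = int(subtotal * 0.10)      # the report truncates 2413.7 to 2413
total = subtotal + misc

for name, ge in arithmetic_unit.items():
    print(f"{name:20s} {ge:8.0f}")
print(f"{'subtotal':20s} {subtotal:8.0f}")
print(f"{'misc (10%)':20s} {misc:8d}")
print(f"{'total':20s} {total:8.0f}")
```

Run as written, the subtotal, the 10% allowance, and the total agree with the summary figures above.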

6.2.5.2 Sequencing and I/O

1. Registers: INDEX (13), MAR (16), IV (7), MASK (7), TC (16),
   OMR (16), IMR (16)

   a. 16 Bit PI/SO: (TC, OMR)

      (2 Regs)(16 Bits/Reg)(1 Unit/4 Bits)[16 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 288 g.e.

   b. 16 Bit SI/PO: (IMR)

      (1 Reg)(16 Bits/Reg)(1 Unit/8 Bits)[4 g.e./Unit + (8 f.f./Unit)(5 g.e./f.f.)] = 88 g.e.

   c. 16 Bit PI/PO: (MAR)

      (1 Reg)(16 Bits/Reg)(1 Unit/4 Bits)[28 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 192 g.e.

   d. 13 Bit PI/PO: (INDEX)

      (1 Reg)(13 Bits/Reg)(1 Unit/4 Bits)[28 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 156 g.e.

   e. 7 Bit PI/PO: (IV, MASK)

      (2 Regs)(7 Bits/Reg)(1 Unit/4 Bits)[28 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 168 g.e.

2. Counters: PC (12), TS (3)

   a. PC:

      (1 CTR)(12 Bits/CTR)(1 Unit/4 Bits)[28 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 144 g.e.

   b. TS:

      (1 CTR)(3 Bits/CTR)(1 Unit/4 Bits)[28 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 36 g.e.

3. Index Adder: 156 g.e.

4. Interrupt Stack:

   (7 Words)(16 Bits/Word)(1 Unit/4x4 Bits)[38 g.e./Unit + (16 f.f./Unit)(5 g.e./f.f.)] = 826 g.e.

5. Encoder:

   (1 Enc)(1 Unit/8 to 3)(29 g.e./Unit) = 29 g.e.

6. Trap Address Cells:

   (7 Words)(12 Bits/Word)(1 Unit/4x4 Bits)[38 g.e./Unit + (16 f.f./Unit)(5 g.e./f.f.)] ≈ 620 g.e.

7. Message Control Network: ≈ 5000 g.e.

8. Bus Control: Main = 16 g.e.

   PC = (12)(4) = 48 g.e.

   Address = (13)(2) = 26 g.e.

9. Vector Buffer:

   a. RAM (n = 8): (256)(16)(1.5) = 6144 g.e.

   b. Controller: ≈ 450 g.e.

10. Output Control: 250 g.e.

11. Miscellaneous Logic = 10%

Summary:

    1.  Registers                   892
    2.  Counters                    180
    3.  Index Adder                 156
    4.  Interrupt Stack             826
    5.  Encoder                      29
    6.  Trap Address Cells          620
    7.  Message Control Network   5,000
    8.  Bus Control                  90
    9.  Vector Buffer             6,594
   10.  Output Control              250
                                 ------
                                 14,637
   11.  Miscellaneous (10%)       1,463
                                 ------
                                 16,100 g.e./SEQ and I/O Unit
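The Sequencing and I/O sizing follows the same slice arithmetic and can be checked the same way. This is a sketch under stated assumptions: the helper `sliced` is hypothetical, the 8-bit slice for the serial-in register and the 12-bit trap address word are inferred from the printed results (88 g.e. and roughly 620 g.e.), and items given only as estimates are entered as stated.

```python
FF_GE = 5  # gate equivalents per flip-flop, as assumed in the text


def sliced(n, bits, bits_per_unit, gates, ffs):
    """Gate equivalents for n blocks of `bits` bits built from fixed slices."""
    return n * bits / bits_per_unit * (gates + ffs * FF_GE)


registers = (sliced(2, 16, 4, 16, 4)     # TC, OMR (PI/SO)      -> 288
             + sliced(1, 16, 8, 4, 8)    # IMR (SI/PO), assumed 8-bit slice -> 88
             + sliced(1, 16, 4, 28, 4)   # MAR (PI/PO)          -> 192
             + sliced(1, 13, 4, 28, 4)   # INDEX (PI/PO)        -> 156
             + sliced(2, 7, 4, 28, 4))   # IV, MASK (PI/PO)     -> 168
counters = sliced(1, 12, 4, 28, 4) + sliced(1, 3, 4, 28, 4)   # PC + TS
interrupt_stack = sliced(7, 16, 16, 38, 16)                   # 4x4-bit file units
trap_cells = int(sliced(7, 12, 16, 38, 16) + 0.5)             # assumed 12-bit words; 619.5 -> 620
bus_control = 16 + 12 * 4 + 13 * 2                            # main + PC + address
vector_buffer = 256 * 16 * 1.5 + 450                          # RAM + controller estimate

items = [registers, counters, 156, interrupt_stack, 29,
         trap_cells, 5000, bus_control, vector_buffer, 250]
subtotal = sum(items)
misc = int(subtotal * 0.10)   # truncated, as in the report
total = subtotal + misc
print(subtotal, misc, total)
```

With those inferred slice widths every line of the summary is reproduced, which is the main evidence for the reconstruction of the two partly illegible formulas.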

6.2.5.3 Memory Unit

The assumed organization is 4K words by 16 bits. Because of the
large amount of RAM compared to the logic on the chip, an estimate of
80K g.e. for the memory area was used. This is due primarily to the
uniformity of the memory versus a random logic array. The logic required on
the chip for memory related functions and the switching controller brings the
memory chip to a maximum of about 82,000 g.e.
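For comparison, the short sketch below applies the general rule of 1.5 g.e. per RAM bit to the assumed 4K by 16 organization and shows how far the 80K g.e. figure sits below it. The variable names are illustrative only, and reading the difference as a density credit for the regular memory array is an interpretation of the text, not a figure the report states.

```python
WORDS = 4 * 1024
BITS_PER_WORD = 16
RAM_GE_PER_BIT = 1.5          # general rule from the sizing assumptions

bits = WORDS * BITS_PER_WORD              # total storage bits
nominal_ge = bits * RAM_GE_PER_BIT        # what the general rule would give
estimate_used = 80_000                    # figure adopted in the text
reduction = 1 - estimate_used / nominal_ge

print(f"bits: {bits}, nominal: {nominal_ge:.0f} g.e., "
      f"reduction vs nominal: {reduction:.1%}")
```

The remaining 2,000 g.e. in the 82,000 g.e. chip maximum is the on-chip control and switching logic mentioned above.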




6.2.5.4 Micro Program Control Unit

The bulk of the logic in the MCU consists of the ROM and the Command
Register (CR). The remaining logic and hardware, such as the next address
mux, flag select mux, etc., will be treated as a small percentage of
the ROM and CR.

ROM: (512 Words)(32 Bits/Word)(1 g.e./Bit) = 16,384 g.e.

CR: (32 Bits)(1 Unit/4 Bits)[27 g.e./Unit + (4 f.f./Unit)(5 g.e./f.f.)] = 376 g.e.

Thus we have 16,760 gate equivalents for the ROM and Command Register.
Allowing a 15% expansion factor gives approximately 19,000 gate
equivalents for the MCU.
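The MCU figure is a two-term sum followed by an expansion factor, which the following sketch reproduces. Variable names are illustrative; the constants are the ones given in the text.

```python
ROM_WORDS, ROM_BITS_PER_WORD = 512, 32
FF_GE = 5                      # gate equivalents per flip-flop

rom_ge = ROM_WORDS * ROM_BITS_PER_WORD * 1     # 1 g.e. per ROM bit
cr_units = 32 / 4                              # 32-bit CR built from 4-bit units
cr_ge = cr_units * (27 + 4 * FF_GE)            # 27 gates + 4 flip-flops per unit

base_ge = rom_ge + cr_ge
mcu_ge = base_ge * 1.15                        # 15% expansion for muxes, etc.

print(f"ROM {rom_ge}, CR {cr_ge:.0f}, base {base_ge:.0f}, with expansion {mcu_ge:.0f}")
```

The expanded value is about 19,300 g.e., which the report rounds to 19,000.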


6.2.6 Off Chip Connections

Another consideration in designing and partitioning the CPU and
memory was the number of pads on each chip. The feasible maximum for
this figure is considered to be about 200.

1. Arithmetic Unit

   Inputs                  Pads
     Command               8 (Avg)
     Main Bus              16
     Power/Ground          2
     Clock                 1

   Outputs
     Flags                 5
     OPCODE                6

   Total:                  38 Pads

2. Sequencing and I/O

   Inputs                  Pads
     Interrupts            7
     Command               8 (Avg)
     Main Bus              16
     Output Control        1
     Message Link          8
     Buffer Controller     1
     Power/Ground          2
     Clock                 1

   Outputs
     Address Bus           13
     Buffer Output         16
     Buffer Controller     1
     Flags                 5

   Total:                  79 Pads

3. Memory Unit

   Inputs                  Pads
     Parallel Data         16
     Serial Data           1
     S/P Select            1
     R/W Select            1
     Enable                1
     I.D.                  3
     Address               13
     Count                 1
     AT                    1
     AT Reset              1
     Power/Ground          2
     Clock                 1

   Outputs
     Main Bus              16
     Request Complete      1

   Total:                  59 Pads


4. Microprogram Control Unit

   Inputs                  Pads
     Status Flags          16 (Max)
     OPCODE                6
     Interrupt             1
     Power/Ground          2
     Clock                 1

   Outputs
     Control Signals       18

   Total:                  44 Pads


6.2.7 Power Requirements

Arithmetic Unit


Static: (25,000 g.e.)(2 nW/g.e.) = 0.05 mW
Dynamic: (4750 g.e.)(40 MHz)(0.15 pJ/g.e.) = 28.5 mW
Output Devices: 13.45 mW

Total: ≈ 42 mW


Sequencing and I/O

Static: (16,000 g.e.)(2 nW/g.e.) = 0.03 mW
Dynamic: (6770 g.e.)(40 MHz)(0.15 pJ/g.e.) = 40.62 mW
Output Devices: 12.35 mW

Total: ≈ 53 mW


Memory:

Static: (20,000 g.e.)(2 nW/g.e.) = 0.04 mW
Dynamic: (2500 g.e.)(8 MHz)(0.15 pJ/g.e.) = 3 mW
Output Devices: 12 mW

Total: ≈ 15 mW


Microprogram Control Unit:

Static: (20,000 g.e.)(2 nW/g.e.) = 0.04 mW
Dynamic: (2000 g.e.)(40 MHz)(0.15 pJ/g.e.) = 12 mW
Output Devices: 9 mW

Total: ≈ 21 mW


Since the complete µPT will consist of the three CPU chips along
with 2 memory chips, power consumption totals approximately 146 mW.

   MCU             21 mW
   ARITH           42
   SEQ and I/O     53
   MEM (2)         30
                  ------
   Total:         146 mW
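The per-chip power figures all follow one pattern: static power per gate equivalent, dynamic energy per switching gate equivalent times clock rate, plus a stated output-device term. The sketch below restates that pattern; the function name `chip_power_mw` is hypothetical, and the gate counts, clock rates, and output-device values are the ones quoted in this section.

```python
STATIC_NW_PER_GE = 2.0      # static dissipation per gate equivalent
DYNAMIC_PJ_PER_GE = 0.15    # switching energy per active gate equivalent


def chip_power_mw(static_ge, active_ge, clock_mhz, output_mw):
    """Total chip power in mW from the three terms used in the report."""
    static_mw = static_ge * STATIC_NW_PER_GE * 1e-6                   # nW -> mW
    # g.e. * MHz * pJ gives microwatts; convert to mW.
    dynamic_mw = active_ge * clock_mhz * DYNAMIC_PJ_PER_GE * 1e-3
    return static_mw + dynamic_mw + output_mw


chips = {
    "ARITH": chip_power_mw(25_000, 4750, 40, 13.45),
    "SEQ and I/O": chip_power_mw(16_000, 6770, 40, 12.35),
    "MEM": chip_power_mw(20_000, 2500, 8, 12.0),
    "MCU": chip_power_mw(20_000, 2000, 40, 9.0),
}
counts = {"ARITH": 1, "SEQ and I/O": 1, "MEM": 2, "MCU": 1}

total_mw = sum(chips[name] * counts[name] for name in chips)
for name, mw in chips.items():
    print(f"{name:12s} x{counts[name]}  {mw:6.2f} mW each")
print(f"total {total_mw:.1f} mW")
```

In every case the dynamic and output-device terms dominate; the static term contributes well under 0.1 mW per chip.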


These power estimates assume the processor is assembled in a large area
hybrid, utilizing a dielectric substrate and low capacitance interconnects.


7.0 CUSTOM LOGIC CHIP SUMMARY AND SCHEDULES


Preliminary designs have been provided for the ASE, µPT, and Memory
(SPS and RAM) chip concepts, and the basic CMOS/SOS logic device design
has been examined. The risks and requirements of Ultra High Density LSI
using E-Beam Lithography have been presented. In general the primary
areas requiring development are: (1) E-Beam Lithography, (2) CMOS/SOS
small device process development, (3) threshold voltage uniformity, and
(4) interconnect techniques and logic organization for ultra-high density LSI.

Figure 7-1 illustrates a recommended 30 month process development
program designed to provide a demonstration of an E-Beam CMOS/SOS ring
oscillator (0.01 pJ, 0.2 nsec gate delay) in φ1, and output devices and
SSI devices intended to show LSI compatibility in φ2.

Table 7-1 provides an estimate of the development schedule for all
major APSP critical device development tasks. It is anticipated that by the
second quarter of 1981 all brassboard demonstration chips required for the
APSP can be designed, processed and tested.

[Figure 7-1 chart: timeline marked at 0, 12, 18, 24 and 30 months.
Activities shown: E-beam masks (E-beam technology, process tradeoffs,
alignment, projection); fab and test process iterations; process
refinement (design rules, speed and power demo); design of demo
functions (logic SSI definition, 2 mm x 2 mm chip, LSI projection vs
direct exposure); fab and test of functional devices; output device and
SSI demo; evaluations and recommendations (speed, power, density, output
device sizes, yield predictions, LSI configuration, performance
predictions).]

Figure 7-1. E-Beam CMOS/SOS microfabrication
process development program.

TABLE 7-1. APSP ESTIMATED OVERALL CUSTOM
LOGIC CHIP SCHEDULES

(Assuming April 1, 1976 start)


SECTION VIII
DEVELOPMENT PLAN

DEVELOPMENT  AND  TEST  PLAN  FOR  AN  ADAPTIVE  PROGRAMMABLE 
SIGNAL  PROCESSOR 

This section describes a plan to design, fabricate and test a feasibility-
demonstration model of an Adaptive Programmable Signal Processor. The
plan (CDRL A009) provides a program having two phases, eight major
subtasks, and a thirty-three month duration. Included are development of
hardware, firmware, software, test equipment and critical technologies.

The  primary  objectives  of  the  plan  are  to: 

• Define a modular adaptive processor having the flexibility and
programmability needed to provide a multi-mission capability
for both surveillance and commanded search modes.

• Design and construct a breadboard adaptive programmable
processor element which, when supplied with suitable computer-
simulated inputs, will be able to provide the performance needed
for target detection and tracking.

• Develop the technology required for ultra-high-density LSI
(UHD-LSI) circuitry in order to verify that the final design is
capable of meeting the desired performance, density and power
requirements.


STATEMENT  OF  WORK 

The statement of work for the adaptive programmable signal processor
program consists of the eight major tasks diagrammed and costed in
Figure 1. The performance specifications for this processor are listed in
Table 1.

The  contractor  is  to  provide  personnel,  materials,  and  facilities  with 
the  objective  to  complete  the  following  development  and  demonstration  tasks. 


Figure 1. Adaptive programmable signal processor development schedule.


TABLE 1. DEMONSTRATION ADAPTIVE PROGRAMMABLE
PROCESSOR SPECIFICATIONS

Evaluated Parameters During the Program

   Modes of operation
   Focal plane assembly output data format
   Frame rate
   Output data format
   Submodule processor output bit rate
   Overall data compression
   System input dynamic range with adaptive control
   Signal dynamic range
   Nuclear event detection and erasure
   A/D conversion accuracy
   Gain normalization
   A/D conversion rate (serial data)
   Clutter rejection capability in BTH mode
      • Temporal filter
      • Pixel space spatial filter
      • Walsh-Hadamard filter
   Star rejection in ATH mode
   Tracking accuracy (1σ error)
   Number of simultaneous tracks (false and real)
   Throughput of track processor
   Track Processor Features
      Ability to track when target moves from one
      detector/multiplexer array to another
      Multi-target crossing tracks
      Track initiation parameters: velocity, acceleration,
      and number of consecutive hits
      Track deletion parameters: number of missed hits,
      velocity, and acceleration

Specifications (in the order printed)

   ATH and BTH
   1) Spatial domain
   2) Walsh-Hadamard transform domain
   State vectors of maximum 128-bit word length
   200 state vectors per second
   26 Kbps
   256
   10^7
   10^4
   Yes
   10 bits
   On individual detector element basis with an accuracy of
   better than 1 percent
   164 kHz
   40 to 60 dB
   10 to 20 dB
   20 to 30 dB
   100 percent
   0.25 pixel
   50 per track submodule processor (200 total)
   2 MIPS for each submodule processor, 8 MIPS total

Task 1 - Processor Definition

Provide a precise definition of two versions of an adaptive programmable
signal processor configuration which will meet the requirements
summarized in Table 1, assuming limited spatial and Walsh-Hadamard
transform domain analog signal preprocessing in the focal plane assembly.
Perform an evaluation, and select one of the two versions for further
development. An adaptive video encoder, a temporal filter, adaptive detection
logic, and a programmable track processor will follow the readout device
on the focal plane. The processor will contain four identical channels for
processing signals from four identical detector/multiplexer array chips.
The processors will interface with a module processor in a test and control
unit. Development of data processing algorithms, identification of the
processor/focal plane interface, and definition of the test interfaces are
included in this task. It is assumed that preliminary definitions of target
and background clutter will be customer-supplied within two months after
program go-ahead.

Task 2 — Performance Analysis

On the basis of the processor definition reached during Task 1 above,
with the aid of computer models, analyze the performance capabilities of
the processor in the presence of the nominal target and clutter scenes.
These analyses will include signal-to-noise ratio, signal-to-clutter ratio,
detection, and false alarm probability as well as tracking errors in the
presence of multiple crossing tracks. It is assumed that the adaptive
threshold will be adjusted to accept no more than 200 real and false targets
at a time, for each submodule processor. The modular concept proposed permits
arraying of multiple submodule processors to achieve a total capability of
several thousand targets.

Task 3 — Demonstration Processor

The demonstration processor will be designed by using (1) off-the-
shelf MSI components, (2) special-purpose A/D-D/A CCD/MOS components
and (3) a CMOS/SOS digital logic chip in the AVE section. This task will
include design effort for support electronics and the special purpose chip
test hardware and software.

A. Special-Purpose A/D-D/A Chip

Design, build, and test an adaptive video encoder with the A/D and
D/A elements, amplifiers, sample-and-hold devices, CCD arithmetic
functions, and interface devices on the same LSI chip. The processes and
design rules needed for the CCD-compatible CMOS clock drivers and logic
circuits will be developed. The design will provide optimum conversion
accuracy, low power, maximum linearity, low noise, minimum geometry,
and complex functional integration and also maintain component independence
to ensure flexibility.

B. CMOS/SOS Digital Logic Chip

Design, build, and test a nuclear event detection and dynamic range
selection logic chip for the AVE. This chip will include approximately
2500 CMOS/SOS devices optimized for small geometry, low power, and high
speed LSI. The design parameters are 0.8 pJ, 4-volt logic levels, and
3-nsec gate propagation delays.

The device will be tested to determine the maximum clock frequency,
noise immunity, power dissipation, and the operation of the overall logic.
The processes and design rules for very small-geometry, low-voltage
CMOS/SOS devices and LSI interconnects will also be developed as part of
this subtask. All chip design effort will include identification of fault
tolerance, nuclear hardening and testability requirements.

Task 4 — Design and Testing of Firmware

A set of processor firmware will be implemented on the basis of the
instruction set selected, and will be designed in detail and tested. The
design will include flow charts and microcoding as well as a microprogram
simulator. The identification, design and development of the facilities
required for firmware development and test are included in this task.

Task 5 — Fabrication and Test of Processor

The demonstration processor will be checked by using standard
computer-aided logic test programs and will be assembled on wire-wrapped
boards. The volume of a given submodule must not exceed 1 ft³. After
unit assembly and test, the submodule will be integrated and tested by
applying analog-simulated input signals and observing digital-output state
vectors to the module processor. The design will be evaluated by comparing
the performance parameters listed in Table 1 to the design goals and
determining the limitations. Included in this task is the identification of
the hardware and software required for processor evaluation and test.

Task  6 — Special  Test  Equipment 

The  special  test  equipment  identified  in  Task  5 will  be  developed 
and  used  to  test  the  submodule  processor  and  its  various  units. 

Task 7 — Software Development

Two major programs will be developed and tested in the submodule:
the macro assembler and the tactical software. The tactical software will
be a detailed implementation of the track initiation, tracking, and track
deletion algorithms for both the Below-the-Horizon mode and the
Above-the-Horizon mode.

Task 8 - Critical Technology Development

A. E-Beam CMOS/SOS

Design, fabricate, and test one LSI-compatible CMOS/SOS logic
function by using electron beam lithography. Develop an E-beam
microfabrication process and a compatible wafer process. The design objectives
are 0.01 pJ per gate, with a 0.2-nsec gate propagation delay for 2-volt ring
oscillators, and 0.15 pJ per gate, with a 0.6-nsec gate propagation delay
with 1000 gates/mm² equivalent LSI density for 2-volt logic devices. The
design rules, processes, their limitations, the interconnect design rules,
their characteristics, and limitations will also be developed.

B. D-ECL Arithmetic Chip

Design, build, and test a D-ECL 8 by 8-bit multiplier. Make the
best use of minimum-geometry devices and interconnects available by means
of high-resolution photolithography and thereby achieve lower power
consumption and high speed. The design objective for overall multiplier
delay is 5 to 7 nsec with total power dissipation of 1 to 1.5 watts. This
corresponds to an equivalent internal logic gate delay of approximately
100 psec, an internal equivalent logic gate power dissipation of 1 mW, and
an internal equivalent logic gate power-delay product of 0.1 pJ.
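The power-delay product quoted for the D-ECL gate follows directly from the gate delay and gate power. The sketch below checks that figure and, as a rough illustration only, derives how many gate delays fit in the 5 to 7 nsec multiplier delay; that second quantity is an inference from the stated numbers and is not given in the text.

```python
GATE_DELAY_S = 100e-12        # approximately 100 psec per equivalent gate
GATE_POWER_W = 1e-3           # 1 mW per equivalent gate

# Power-delay product in picojoules.
power_delay_pj = GATE_DELAY_S * GATE_POWER_W * 1e12

# Illustrative inference: number of gate delays that fit in the overall
# multiplier delay objective of 5 to 7 nsec.
multiplier_delay_ns = (5.0, 7.0)
gate_delays_in_path = tuple(d * 1e-9 / GATE_DELAY_S for d in multiplier_delay_ns)

print(f"power-delay product: {power_delay_pj:.2f} pJ")
print(f"gate delays in path: {gate_delays_in_path[0]:.0f} to {gate_delays_in_path[1]:.0f}")
```

For comparison, the E-beam CMOS/SOS objective in part A is 0.01 pJ per gate for ring oscillators, an order of magnitude below the D-ECL figure.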

In the course of the multiplier development, design rules for very
small geometry D-ECL LSI devices and interconnects will also be developed.
These rules will be available for use in subsequent design and fabrication
of LSI components. Sample quantities of the multipliers built as part of
this task will be tested to verify that their logic operates properly and to
determine the maximum operating throughput rate and the amount of power
dissipated.

The development of certain fabrication hardware unique to the above
technologies also constitutes part of this task.